Nucleotide analogs

ABSTRACT

Provided herein is technology relating to the manipulation and detection of nucleic acids, including but not limited to compositions, methods, and kits related to nucleotides comprising a chemically reactive linking moiety.

This application is a continuation of U.S. patent application Ser. No. 16/752,865, filed Jan. 27, 2020, which is a continuation of U.S. application Ser. No. 15/944,553, filed Apr. 3, 2018, now U.S. Pat. No. 10,577,646, issued Mar. 3, 2020, which is a continuation of U.S. application Ser. No. 14/463,412, filed Aug. 19, 2014, now U.S. Pat. No. 9,932,623, issued Apr. 3, 2018, which claims priority to U.S. provisional patent application Ser. No. 61/867,202, filed Aug. 19, 2013, each of which is herein incorporated by reference in its entirety.

FIELD OF INVENTION

Provided herein is technology relating to the manipulation and detection of nucleic acids, including but not limited to compositions, methods, and kits related to nucleotides comprising a chemically reactive linking moiety.

BACKGROUND

Nucleic acid detection methodologies continue to serve as a critical tool in the field of molecular diagnostics. The ability to manipulate biomolecules specifically and efficiently provides the basis for many successful detection technologies. For example, linking a chemical, biological, or physical moiety (e.g., adding a “tag”) to a biomolecule of interest is one key technology related to the subsequent manipulation, detection, and/or identification of the biomolecule.

Conventional linking technologies often rely on enzyme-assisted methods. For example, some methods to append a desired tag onto a target DNA use a ligase enzyme to join the target DNA to the tag (e.g., another DNA fragment comprising the tag, another DNA fragment to serve as the tag itself, etc.). In another method, a polymerase enzyme incorporates a tag-modified substrate of the polymerase (e.g., a dNTP or a modified dNTP) into a nucleic acid. An advantage of these enzyme-assisted methods is that the links joining the biomolecule to the moiety are “natural” linkages that allow further manipulation of the conjugated product. However, some important drawbacks include low product yields, inefficient reactions, and low specificity due to multiple reactive groups present on a target biomolecule that the enzyme can recognize. In addition, conventional methods have high costs in both time and money.

SUMMARY

Accordingly, provided herein is technology related to linking moieties to biomolecules using chemical conjugation. These linkage reactions are more specific and efficient than conventional technologies because the reactions are designed to include a mechanism of conjugation between specific chemical moieties.

While most conventional chemical covalent linkages are not recognized and/or processed by biological catalysts (e.g., enzymes), thus limiting subsequent manipulation of the conjugated product, the technology described herein provides a chemical linkage that allows downstream manipulation of the conjugated product by standard molecular biological and biochemical techniques.

For example, while there are many nucleotide analogs currently available that can terminate a polymerase reaction (e.g., dideoxynucleotides and various 3′ modified nucleotide analogs), these molecules inhibit or severely limit further manipulation of nucleic acids terminated by these analogs. For example, subsequent enzymatic reactions such as the polymerase chain reaction are completely or substantially inhibited by the nucleotide analogs. In addition, some solutions have utilized nucleotide analogs called “reversible terminators” in which the 3′ hydroxyl groups are capped with a chemical moiety that can be removed with a specific chemical reaction, thus regenerating a free 3′ hydroxyl. Use of these nucleotide analogs, however, requires an additional deprotection (uncapping) step to remove the protecting (capping) moiety from the nucleic acid as well as an additional purification step to remove the released protecting (capping) moiety from the reaction mixture.

In contrast to conventional technologies, provided herein is technology related to the design, synthesis, and use of nucleotide (e.g., ribonucleotide, deoxyribonucleotide) analogs that comprise chemically reactive groups. For example, some embodiments provide a nucleotide analog comprising an alkyne group, e.g., a nucleotide comprising a 3′ alkyne group such as provided in embodiments of the technology related to 3′-O-propargyl deoxynucleotides. The chemical groups and linkages do not impair or significantly limit the use of subsequent molecular biological techniques to manipulate compounds (e.g., nucleic acids, conjugates, and other biomolecules) comprising the nucleotide analogs. As such, the compounds (e.g., nucleic acids, conjugates, and other biomolecules) comprising the described nucleotide analogs are useful for many applications.

In some embodiments, nucleotide analogs find use as functional nucleotide terminators, that is, the nucleotide analogs terminate synthesis of a nucleic acid by a polymerase and additionally comprise a functional reactive group for subsequent chemical and/or biochemical processing, reaction, and/or manipulation. In particular, some embodiments provide a nucleotide analog in which the 3′ hydroxyl group is capped by a chemical moiety comprising, e.g., an alkyne (e.g., a carbon-carbon triple bond, e.g., C≡C). When the 3′ alkyne nucleotide analog is incorporated into a nucleic acid by a polymerase (e.g., a DNA and/or RNA polymerase) during synthesis, further elongation of the nucleic acid is halted (“terminated”) because the nucleic acid does not have a free 3′ hydroxyl to provide the proper substrate for subsequent nucleotide addition.

While the nucleotide analogs are not a natural substrate for conventional molecular biological enzymes, the alkyne chemical moiety is a well-known chemical conjugation partner reactive with particular functional moieties. For example, an alkyne reacts with an azide group (e.g., N₃, e.g., N═N═N) in a copper(I)-catalyzed azide-alkyne cycloaddition (“CuAAC”) reaction to form two new covalent bonds between azide nitrogens and alkyne carbons. The covalent bonds form a chemical link (e.g., comprising a five-membered triazole ring) between a first component and a second component that comprised the azide and the alkyne moieties before linkage. This type of cycloaddition reaction is one of the foundational reactions of “click chemistry” because it provides a desirable chemical yield, is physiologically stable, and exhibits a large thermodynamic driving force that favors a “spring-loaded” reaction that yields a single product (e.g., a 1,4-regioisomer of 1,2,3-triazole). See, e.g., Huisgen (1961) “Centenary Lecture—1,3-Dipolar Cycloadditions”, Proceedings of the Chemical Society of London 357; Kolb, Finn, Sharpless (2001) “Click Chemistry: Diverse Chemical Function from a Few Good Reactions”, Angewandte Chemie International Edition 40(11): 2004-2021. For example:

where R₁ and R₂ are individually any chemical structure or chemical moiety.

The reaction can be performed in a variety of solvents, including aqueous mixtures, compositions comprising water and/or aqueous mixtures, and a variety of organic solvents including compositions comprising alcohols, dimethyl sulfoxide (DMSO), dimethylformamide (DMF), tert-butyl alcohol (TBA or tBuOH; also known as 2-methyl-2-propanol (2M2P)), and acetone. In some embodiments, the reaction is performed in a milieu comprising a copper-based catalyst such as Cu/Cu(OAc)₂, a tertiary amine such as tris-(benzyltriazolylmethyl)amine (TBTA), and/or tetrahydrofuran and acetonitrile (THF/MeCN).

In some embodiments, the triazole ring linkage has a structure according to:

where R₁ and R₂ are individually any chemical structure or chemical moiety (and may be the same or different chemical structures or chemical moieties in different structures) and B, B₁, and B₂ individually indicate the base of the nucleotide (e.g., adenine, guanine, thymine, cytosine, or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc.).

The triazole ring linkage formed by the alkyne-azide cycloaddition has similar characteristics (e.g., physical, biological, biochemical, chemical characteristics, etc.) to a natural phosphodiester bond present in nucleic acids and therefore is a nucleic acid backbone mimic. Consequently, conventional enzymes that recognize natural nucleic acids as substrates also recognize as substrates the products formed by alkyne-azide cycloaddition as provided by the technology described herein. See, e.g., El-Sagheer et al. (2011) “Biocompatible artificial DNA linker that is read through by DNA polymerases and is functional in Escherichia coli”, Proc Natl Acad Sci USA 108(28): 11338-43.

In some embodiments, the use of nucleotide analogs comprising an alkyne (e.g., a 3′-O-propargyl nucleotide analog) produces nucleic acids (e.g., DNA or RNA polynucleotide fragments) that have a terminal 3′ alkyne group. For example, in some embodiments, nucleotide analogs comprising an alkyne (e.g., a 3′-O-propargyl nucleotide analog) are incorporated into a growing strand of a nucleic acid in a polymerase extension reaction; once incorporated, the nucleotide analogs halt the polymerase reaction. These terminated nucleic acids are an appropriate chemical reactant for a click chemistry reaction (e.g., alkyne-azide cycloaddition), e.g., for a chemical ligation to an azide-modified molecule such as a 5′-azide modified nucleic acid, a labeling moiety comprising an azide, a solid support comprising an azide, a protein comprising an azide, etc., including, but not limited to, moieties, entities, and components discussed herein. In some embodiments, for example, the 3′-O-propargyl group at the 3′ terminus of the nucleic acid product is used in a tagging reaction with an azide-modified tag using chemical ligation, e.g., as provided by a click chemistry reaction. The covalent linkage created using this chemistry mimics that of a natural nucleic acid phosphodiester bond, thereby providing for the use of the chemically ligated nucleic acids in subsequent enzymatic reactions, such as a polymerase chain reaction, with the triazole chemical linkage causing minimal, limited, or undetectable (e.g., no) inhibition of the enzymatic reaction.

In some embodiments, the nucleotide analog comprising an alkyne is reacted with a reactant comprising a phosphine moiety in a Staudinger ligation. In a Staudinger ligation, an electrophilic trap (e.g., a methyl ester) is placed on a triarylphosphine aryl group (usually ortho to the phosphorus atom) and reacted with the azide to yield an aza-ylide intermediate, which then rearranges (e.g., in aqueous media) to produce a compound with an amide group and a phosphine oxide function. The Staudinger ligation ligates (attaches and covalently links) the two starting molecules together.

Accordingly, provided herein is technology related to a composition comprising a nucleotide analog having a structure according to:

wherein B is a base and P comprises a phosphate moiety. In some embodiments, P comprises a tetraphosphate; a triphosphate; a diphosphate; a monophosphate; a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate.

In some embodiments, P comprises an azide (e.g., N₃, e.g., N═N═N), thus providing, in some embodiments, a directional, bi-functional polymerization agent as described herein.

In some embodiments, B is a cytosine, guanine, adenine, thymine, or uracil base. That is, in some embodiments, B is a purine or a pyrimidine or a modified purine or a modified pyrimidine. The technology is not limited in the bases B that find use in the nucleotide analogs. For example, B can be any synthetic, artificial, or natural base; thus, in some embodiments B is a synthetic base; in some embodiments, B is an artificial base; in some embodiments, B is a natural base. In some embodiments, compositions comprise a nucleotide analog and a nucleic acid (e.g., a polynucleotide). Compositions in some embodiments further comprise a polymerase and/or a nucleotide (e.g., a conventional nucleotide). In compositions comprising a nucleotide and a nucleotide analog, in some embodiments the number ratio of the nucleotide analog to the nucleotide is 1:1, 1:2, 1:3, 1:4, 1:5, 1:10, 1:15, 1:20, 1:25, 1:30, 1:50, 1:75, 1:100, 1:200, 1:300, 1:400, 1:500, 1:600, 1:700, 1:800, 1:900, 1:1000, 1:5000, or 1:10000.
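As a rough illustration of how the number ratio of nucleotide analog to nucleotide might relate to where polymerization terminates, the following Python sketch assumes, purely for illustration and not as a statement of this disclosure, that a polymerase incorporates the analog and the conventional nucleotide in proportion to their number ratio and that the same ratio applies at every template position. Under that assumption the terminated position follows a geometric distribution. The function names are hypothetical.

```python
from fractions import Fraction


def termination_probability(analog_parts: int, nucleotide_parts: int) -> Fraction:
    """Chance that one incorporation event adds the terminating analog.

    Illustrative assumption: the analog and the conventional nucleotide are
    incorporated in proportion to their number ratio (no polymerase bias).
    """
    if analog_parts <= 0 or nucleotide_parts < 0:
        raise ValueError("analog_parts must be > 0 and nucleotide_parts >= 0")
    return Fraction(analog_parts, analog_parts + nucleotide_parts)


def probability_terminated_at(position: int, p: Fraction) -> Fraction:
    """Probability that extension stops exactly at a 1-based position.

    The first (position - 1) incorporations must be conventional nucleotides,
    followed by one analog incorporation (geometric distribution).
    """
    if position < 1:
        raise ValueError("position is 1-based")
    return (1 - p) ** (position - 1) * p


def mean_extension_length(p: Fraction) -> Fraction:
    """Mean number of incorporations up to and including the terminator."""
    return 1 / p


if __name__ == "__main__":
    p = termination_probability(1, 100)  # an analog:nucleotide ratio of 1:100
    print("per-position termination probability:", p)
    print("mean extension length:", mean_extension_length(p))
```

In practice a polymerase may discriminate between an analog and a conventional nucleotide, so an effective ratio would have to be determined empirically; the sketch only shows the direction of the relationship, in which a lower proportion of analog gives longer terminated fragments on average.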

In some embodiments, a nucleic acid comprises a nucleotide analog as provided herein. In some embodiments, the nucleic acid comprises the nucleotide analog at its 3′ end (e.g., the nucleotide analog is at the 3′ end of the nucleic acid). The technology, in some embodiments, relates to the synthesis of a nucleic acid comprising a nucleotide analog by a biological enzyme. That is, the biological enzyme recognizes the nucleotide analog as a substrate and incorporates the nucleotide analog into the nucleic acid. For example, in some embodiments, the nucleic acid is produced by a polymerase.

In some embodiments, the compositions further comprise an azide, e.g., a component, entity, molecule, surface, biomolecule, etc., comprising an azide.

In some embodiments, the compositions comprise multiple nucleic acids; accordingly, in some embodiments, compositions comprise a second nucleic acid (e.g., in addition to a nucleic acid comprising a nucleotide analog). The technology encompasses functionalized nucleic acids for reacting with a nucleic acid comprising a nucleotide analog. Thus, in some embodiments, the second nucleic acid comprises an azide moiety, e.g., in some embodiments, the second nucleic acid comprises an azide moiety at the 5′ end of the second nucleic acid.

The technology is not limited in the entity (e.g., comprising an azide group) reacted with the nucleic acid comprising the nucleotide analog. For instance, in some embodiments, compositions further comprise a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, a biotin comprising an azide, or a protein comprising an azide. In some embodiments, an alkyne moiety and an azide moiety are reacted using a “click chemistry” reaction catalyzed by a copper-based catalyst. As such, in some embodiments compositions further comprise a copper-based catalyst reagent. The reaction of the azide and alkyne produces, in some embodiments, a triazole moiety. In some embodiments, a nucleic acid comprising an alkyne (e.g., a nucleic acid comprising a nucleotide analog comprising an alkyne) is reacted with a nucleic acid comprising an azide to produce a longer nucleic acid. As such, in some embodiments compositions according to the technology further comprise a nucleic acid comprising a triazole (e.g., that forms a link between the two nucleic acids). In some embodiments, the reaction of the alkyne and azide proceeds with regioselectivity, e.g., in some embodiments the nucleic acid comprises a 1′, 4′ substituted triazole. In some embodiments, the nucleic acid comprising the nucleotide analog is reacted with an adaptor oligonucleotide, an adaptor oligonucleotide comprising a barcode, or a barcode oligonucleotide comprising an azide. Thus, in some embodiments are provided reaction mixtures comprising an adaptor oligonucleotide, an adaptor oligonucleotide comprising a barcode, or a barcode oligonucleotide.

In some embodiments, a nucleic acid (e.g., formed from uniting two nucleic acids by “click chemistry” reaction of an alkyne and an azide) comprises a structure according to:

where R₁ and R₂ are individually any chemical structure or chemical moiety (and may be the same or different chemical structures or chemical moieties in different structures) and B₁ and B₂ individually indicate the base of the nucleotide (e.g., adenine, guanine, thymine, cytosine, or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc.).

Another aspect of the technology relates to embodiments of methods for synthesizing a modified nucleic acid, the method comprising providing a nucleotide analog comprising an alkyne group and linking a nucleic acid to the nucleotide analog to produce a modified nucleic acid comprising the nucleotide analog. In some embodiments, the nucleotide analog has a structure according to:

wherein B is a base (e.g., cytosine, guanine, adenine, thymine, or uracil) and P comprises a triphosphate moiety. Embodiments of the method comprise further providing, e.g., a template, a primer, a nucleotide (e.g., a conventional nucleotide), and/or a polymerase. The nucleotide analogs are recognized as a substrate by biological enzymes such as polymerases; thus, in some embodiments, a polymerase catalyzes linking a nucleic acid to the nucleotide analog to produce a modified nucleic acid comprising the nucleotide analog. The modified nucleic acid provides a substrate for reaction with an azide-carrying entity, e.g., to form a conjugated product by a “click chemistry” reaction. Thus, in some embodiments the methods further comprise reacting the modified nucleic acid with an azide moiety. The methods are not limited in the entity that comprises the azide moiety; for example, in some embodiments the methods comprise reacting the modified nucleic acid with a second nucleic acid comprising an azide moiety, e.g., reacting the modified nucleic acid with a second nucleic acid comprising an azide moiety at the 5′ end of the second nucleic acid, a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, and/or a protein comprising an azide.

The methods find use in linking an adaptor oligonucleotide (e.g., for use in next-generation sequencing) to a nucleic acid comprising a nucleotide analog. Accordingly, in some embodiments, the methods further comprise reacting the modified nucleic acid with an adaptor oligonucleotide comprising an azide moiety, an adaptor oligonucleotide comprising a barcode and comprising an azide moiety, and/or a barcode oligonucleotide comprising an azide moiety, e.g., to produce a nucleic acid-oligonucleotide conjugate. In some embodiments, reactions of a nucleotide analog (e.g., a nucleic acid comprising a nucleotide analog) and an azide are catalyzed by a copper-based catalyst reagent. Associated methods, accordingly, in some embodiments comprise reacting the modified nucleic acid with an azide moiety and a copper-based catalyst reagent. As the triazole ring formed by the “click chemistry” reaction does not substantially and/or detectably inhibit biological enzyme activity, the nucleic acid-oligonucleotide conjugate provides a useful nucleic acid for further manipulation, e.g., in some embodiments the modified nucleic acid is a substrate for a biological enzyme, the modified nucleic acid is a substrate for a polymerase, and/or the modified nucleic acid is a substrate for a sequencing reaction.

The nucleotide analogs provided herein are functional terminators, e.g., they act to terminate synthesis of a nucleic acid (e.g., similar to a dideoxynucleotide as used in Sanger sequencing) while also comprising a reactive group for further chemical processing. Accordingly, as described herein, in some embodiments, the methods further comprise terminating polymerization with the nucleotide analog.

Related methods provide, in some embodiments, a method for sequencing a nucleic acid, the method comprising hybridizing a primer to a nucleic acid template to form a hybridized primer/nucleic acid template complex; providing a plurality of nucleotide analogs, each nucleotide analog comprising an alkyne moiety; reacting the hybridized primer/nucleic acid template complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog; and reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring. In particular embodiments, the nucleotide analogs are 3′-O-propargyl-dNTP nucleotide analogs and N is selected from the group consisting of A, C, G, T, and U. As the triazole ring formed by the “click chemistry” reaction does not substantially and/or detectably inhibit biological enzyme activity, the nucleic acid-oligonucleotide conjugate provides a useful nucleic acid for further manipulation. Thus, in some embodiments the structure comprising a triazole ring is used in subsequent enzymatic reactions, e.g., a polymerase chain reaction and/or a sequencing reaction. Polymerization in the presence of nucleotide analogs is performed, in some embodiments, in the presence also of conventional (e.g., non-terminator) nucleotides. Related methods comprise providing conventional nucleotides.

Also provided herein are embodiments of kits. For example, in some embodiments, kits are provided for synthesizing a modified nucleic acid, the kit comprising a nucleotide analog comprising an alkynyl group; and a copper-based catalyst reagent. In some embodiments kits further comprise other components that find use in the processing and/or manipulation of nucleic acids. Thus, in some embodiments kits further comprise a polymerase, an adaptor oligonucleotide comprising an azide moiety, and/or a nucleotide (e.g., a conventional nucleotide). For example, some embodiments of the technology relate to kits for producing an NGS sequencing library and/or for obtaining sequence information from a target nucleic acid. For example, some embodiments provide a kit comprising a nucleotide analog, e.g., for producing a nucleotide fragment ladder according to the methods provided herein. In some embodiments, the nucleotide analog is a 3′-O-blocked nucleotide analog, e.g., a 3′-O-alkynyl nucleotide analog, e.g., a 3′-O-propargyl nucleotide analog. In some embodiments, conventional A, C, G, U, and/or T nucleotides are provided in a kit as well as one or more (e.g., 1, 2, 3, or 4) A, C, G, U, and/or T nucleotide analogs.

In some embodiments, kits comprise a polymerase (e.g., a natural polymerase, a modified polymerase, and/or an engineered polymerase, etc.), e.g., for amplification (e.g., by thermal cycling, isothermal amplification) or for sequencing, etc. In some embodiments, kits comprise a ligase, e.g., for attaching adaptors to a nucleic acid such as an amplicon or a ladder fragment or for circularizing an adaptor-amplicon. Some embodiments of kits comprise a copper-based catalyst reagent, e.g., for a click chemistry reaction, e.g., to react an azide and an alkynyl group to form a triazole link. Some kit embodiments provide buffers, salts, reaction vessels, instructions, and/or computer software.

In some embodiments, kits comprise primers and/or adaptors. In some embodiments, the adaptors comprise a chemical modification suitable for attaching the adaptor to the nucleotide analog, e.g., by click chemistry. For example, in some embodiments, the kit comprises a nucleotide analog comprising an alkyne group and an adaptor oligonucleotide comprising an azide (N₃) group. In some embodiments, a “click chemistry” process such as an azide-alkyne cycloaddition is used to link the adaptor to the fragment via formation of a triazole.

Particular kit embodiments provide a kit for generating a sequencing library, the kit comprising an adaptor oligonucleotide comprising a first reactive group (e.g., an azide), a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog or a 3′-O-propargyl nucleotide analog, e.g., comprising an alkyne group, e.g., comprising a second reactive group that forms a chemical bond with the first reactive group, e.g., using click chemistry), a polymerase (e.g., a polymerase for isothermal amplification or thermal cycling), a second adaptor oligonucleotide, one or more compositions comprising a nucleotide or a mixture of nucleotides, and a ligase or a copper-based click chemistry catalyst reagent.

In some embodiments of kits, kits comprise one or more 3′-O-blocked nucleotide analog(s) (e.g., one or more 3′-O-alkynyl nucleotide analog(s) such as one or more 3′-O-propargyl nucleotide analog(s)) and one or more adaptor oligonucleotides comprising an azide group (e.g., a 5′-azido oligonucleotide, e.g., a 5′-azido-methyl oligonucleotide). Some kit embodiments further provide a 5′-azido-methyl oligonucleotide comprising a barcode. Some kit embodiments further provide a plurality of 5′-azido-methyl oligonucleotides comprising a plurality of barcodes (e.g., each 5′-azido-methyl oligonucleotide comprises a barcode that is distinguishable from one or more other barcodes of one or more other 5′-azido-methyl oligonucleotide(s) comprising a different barcode). Further kit embodiments comprise a click chemistry catalytic reagent (e.g., a copper(I) catalytic reagent).

Some kit embodiments comprise one or more standard dNTPs in addition to the one or more 3′-O-blocked nucleotide analog(s) (e.g., one or more 3′-O-alkynyl nucleotide analog(s) such as one or more 3′-O-propargyl nucleotide analog(s)). For instance, some kit embodiments provide dATP, dCTP, dGTP, and dTTP, either in separate vessels or as a mixture with one or more of 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, 3′-O-propargyl-dGTP, and/or 3′-O-propargyl-dTTP.

Some kit embodiments further comprise a polymerase obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species (e.g., an organism of the taxonomic lineage Archaea; Euryarchaeota; Thermococci; Thermococcales; Thermococcaceae; Thermococcus). In some embodiments, the polymerase is obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species 9°N-7. In some embodiments, the polymerase comprises amino acid substitutions that provide for improved incorporation of modified substrates such as modified dideoxynucleotides, ribonucleotides, and acyclonucleotides. In some embodiments, the polymerase comprises amino acid substitutions that provide for improved incorporation of nucleotide analogs comprising modified 3′ functional groups such as the 3′-O-propargyl dNTPs described herein. In some embodiments the amino acid sequence of the polymerase comprises one or more amino acid substitutions relative to the Thermococcus sp. 9°N-7 wild-type polymerase amino acid sequence, e.g., a substitution of alanine for the aspartic acid at amino acid position 141 (D141A), a substitution of alanine for the glutamic acid at amino acid position 143 (E143A), a substitution of valine for the tyrosine at amino acid position 409 (Y409V), and/or a substitution of leucine for the alanine at amino acid position 485 (A485L). In some embodiments, the polymerase is provided in a heterologous host organism such as Escherichia coli that comprises a cloned Thermococcus sp. 9°N-7 polymerase gene, e.g., comprising one or more mutations (e.g., D141A, E143A, Y409V, and/or A485L). In some embodiments, the polymerase is a Thermococcus sp. 9°N-7 polymerase sold under the trade name THERMINATOR (e.g., THERMINATOR II) by New England BioLabs (Ipswich, Mass.).
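The substitution labels used above (e.g., D141A) follow the common convention of wild-type residue, 1-based position, and replacement residue. A minimal Python sketch of reading such labels and applying them to an amino acid sequence is given below; the function names are hypothetical, and the short sequence in the usage example is a made-up toy string, not the Thermococcus sp. 9°N-7 polymerase sequence.

```python
import re

# Label format assumed here: one-letter wild-type residue, 1-based position,
# one-letter replacement residue (e.g., "D141A").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'D141A' into ('D', 141, 'A')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, labels: list[str]) -> str:
    """Return a copy of sequence with each substitution applied.

    Raises ValueError if a position is out of range or if the residue found
    at that position does not match the wild-type residue in the label.
    """
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: found {residues[position - 1]} at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy = "MDEY"  # hypothetical four-residue sequence for demonstration only
    print(apply_substitutions(toy, ["D2A", "E3A"]))
```

Checking the wild-type residue before substituting guards against applying a label to a sequence that uses a different numbering.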

Accordingly, some kit embodiments comprise one or more 3′-O-propargyl nucleotide analog(s) (e.g., one or more of 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, 3′-O-propargyl-dGTP, and/or 3′-O-propargyl-dTTP), a mixture of standard dNTPs (e.g., dATP, dCTP, dGTP, and dTTP), one or more 5′-azido-methyl oligonucleotide adaptors, a polymerase obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species, and a click chemistry catalyst for forming a triazole from an azide group and an alkyne group. In some embodiments, the one or more 3′-O-propargyl nucleotide analog(s) (e.g., one or more of 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, 3′-O-propargyl-dGTP, and/or 3′-O-propargyl-dTTP) and the mixture of standard dNTPs (e.g., dATP, dCTP, dGTP, and dTTP) are provided together, e.g., the kit comprises a solution comprising the one or more 3′-O-propargyl nucleotide analog(s) (e.g., one or more of 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, 3′-O-propargyl-dGTP, and/or 3′-O-propargyl-dTTP) and the mixture of standard dNTPs (e.g., dATP, dCTP, dGTP, and dTTP). In some embodiments, the solution comprises the one or more 3′-O-propargyl nucleotide analog(s) (e.g., one or more of 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, 3′-O-propargyl-dGTP, and/or 3′-O-propargyl-dTTP) and the mixture of standard dNTPs (e.g., dATP, dCTP, dGTP, and dTTP) at a ratio of from 1:500 to 500:1 (e.g., 1:500, 1:450, 1:400, 1:350, 1:300, 1:250, 1:200, 1:150, 1:100, 1:90, 1:80, 1:70, 1:60, 1:50, 1:40, 1:30, 1:20, 1:10, 1:9, 1:8, 1:7, 1:6, 1:5, 1:4, 1:3, 1:2, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, 150:1, 200:1, 250:1, 300:1, 350:1, 400:1, 450:1, or 500:1).

Some embodiments of kits further comprise software for processing sequence data, e.g., to extract nucleotide sequence data from the data produced by a sequencer; to identify barcodes and target subsequences from the data produced by a sequencer; to align and/or assemble subsequences from the data produced by a sequencer to produce a consensus sequence; and/or to align subsequences and/or a consensus sequence to a reference sequence.
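By way of a hedged sketch only, one way such software might identify barcodes and target subsequences is shown below in Python. The read layout (a fixed-length barcode at the start of each read, followed by the target subsequence), the exact-match rule, and all function names are assumptions made for illustration and are not specified by this disclosure.

```python
def split_read(read: str, barcode_length: int) -> tuple[str, str]:
    """Split a read into (barcode, target subsequence).

    Assumed layout: the first barcode_length bases are the barcode and the
    remainder is the target subsequence.
    """
    if barcode_length < 0 or len(read) < barcode_length:
        raise ValueError("read is shorter than the barcode length")
    return read[:barcode_length], read[barcode_length:]


def group_by_barcode(
    reads: list[str], known_barcodes: list[str], barcode_length: int
) -> tuple[dict[str, list[str]], list[str]]:
    """Group target subsequences by exactly matching known barcodes.

    Returns (groups, unassigned), where unassigned holds reads that are too
    short or whose leading bases match no known barcode.
    """
    groups: dict[str, list[str]] = {barcode: [] for barcode in known_barcodes}
    unassigned: list[str] = []
    for read in reads:
        if len(read) < barcode_length:
            unassigned.append(read)
            continue
        barcode, target = split_read(read, barcode_length)
        if barcode in groups:
            groups[barcode].append(target)
        else:
            unassigned.append(read)
    return groups, unassigned


if __name__ == "__main__":
    example_reads = ["ACGTTTGACC", "TTAGGGCATA", "ACGTCCCAAA", "GG"]
    grouped, leftover = group_by_barcode(example_reads, ["ACGT", "TTAG"], 4)
    print(grouped)
    print(leftover)
```

A practical tool would likely also tolerate sequencing errors within the barcode (e.g., by allowing a bounded number of mismatches) and would then pass the grouped subsequences to alignment or assembly; those steps are outside this sketch.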

In some embodiments, provided herein are compositions comprising a nucleotide analog having a structure according to:

wherein B is a base (e.g., a purine or a pyrimidine such as a cytosine, guanine, adenine, thymine, or uracil; e.g., a modified purine or a modified pyrimidine) and P comprises a phosphate moiety (e.g., a tetraphosphate; a triphosphate; a diphosphate; a monophosphate; a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate); a nucleic acid; a polymerase; and a nucleotide (e.g., comprising the base B, e.g., in a number ratio of the nucleotide analog to the nucleotide that is 1:1, 1:2, 1:3, 1:4, 1:5, 1:10, 1:15, 1:20, 1:25, 1:30, 1:50, 1:75, 1:100, 1:200, 1:300, 1:400, 1:500, 1:600, 1:700, 1:800, 1:900, 1:1000, 1:5000, or 1:10000).

Also provided are embodiments of compositions comprising a nucleic acid (e.g., produced by a polymerase), wherein the nucleic acid comprises a nucleotide analog (e.g., at its 3′ end) having a structure according to:

wherein B is a base (e.g., a purine or a pyrimidine such as a cytosine, guanine, adenine, thymine, or uracil; e.g., a modified purine or a modified pyrimidine) and P comprises a phosphate moiety (e.g., a tetraphosphate; a triphosphate; a diphosphate; a monophosphate; a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate); a second nucleic acid (e.g., comprising an azide, e.g., at its 5′ end), a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, a biotin comprising an azide, or a protein comprising an azide; a copper (e.g., copper-based) catalyst reagent; a nucleic acid comprising a triazole (e.g., a 1′, 4′ substituted triazole); and/or a structure such as:

where R₁ and R₂ are individually any chemical structure or chemical moiety (and may be the same or different chemical structures or chemical moieties in different structures) and B₁ and B₂ individually indicate the base of the nucleotide (e.g., adenine, guanine, thymine, cytosine, or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc.); an adaptor oligonucleotide, an adaptor oligonucleotide comprising a barcode, or a barcode oligonucleotide.

In another aspect, the technology provides a method for synthesizing a modified nucleic acid, the method comprising providing a nucleotide analog comprising an alkyne group, e.g., a nucleotide having a structure according to:

wherein B is a base (e.g., cytosine, guanine, adenine, thymine, or uracil) and P comprises a triphosphate moiety; linking a nucleic acid to the nucleotide analog to produce a modified nucleic acid comprising the nucleotide analog; providing a template; providing a primer; providing a nucleotide; providing a polymerase (e.g., to catalyze the linking of the nucleic acid to the nucleotide analog); terminating polymerization with the nucleotide analog; reacting the modified nucleic acid with an azide moiety (e.g., with a second nucleic acid comprising an azide moiety at its 5′ end, a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, a protein comprising an azide, an adaptor oligonucleotide comprising an azide moiety, an adaptor oligonucleotide comprising a barcode and comprising an azide moiety, or a barcode oligonucleotide comprising an azide moiety), e.g., to produce a nucleic acid-oligonucleotide conjugate (e.g., that is a substrate for a biological enzyme such as a polymerase and/or to provide a substrate for a sequencing reaction); and/or reacting the modified nucleic acid with an azide moiety and a copper-based catalyst reagent.

In some embodiments, a method for sequencing a nucleic acid is provided, the method comprising hybridizing a primer to a nucleic acid template to form a hybridized primer/nucleic acid template complex; providing a plurality of nucleotide analogs (e.g., 3′-O-propargyl-dNTP nucleotide analogs wherein N is selected from the group consisting of A, C, G, T, and U), each nucleotide analog comprising an alkyne moiety; providing conventional nucleotides; reacting the hybridized primer/nucleic acid template complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog; and reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring (e.g., that is used in subsequent enzymatic reactions such as a polymerase chain reaction).

In some embodiments, a kit for synthesizing a modified nucleic acid is provided, the kit comprising a nucleotide analog comprising an alkynyl group; a copper-based catalyst reagent; a polymerase; an adaptor oligonucleotide comprising an azide moiety; and a conventional nucleotide.

Particular embodiments are related to generating a nucleic acid fragment ladder using a polymerase reaction comprising standard dNTPs and 3′-O-propargyl-dNTPs at a molar ratio of from 1:500 to 500:1 (standard dNTPs to 3′-O-propargyl-dNTPs). Terminated nucleic acid fragments produced by methods described herein comprise a propargyl group on their 3′ ends. Further embodiments are related to attaching an adaptor to the 3′ ends of the nucleic acid fragments using chemical conjugation. For example, in some embodiments a 5′-azido-modified oligonucleotide (e.g., a 5′-azido-methyl-modified oligonucleotide) is conjugated to the 3′-propargyl-terminated nucleic acid fragments by click chemistry (e.g., in a reaction catalyzed by a copper (e.g., copper (I)) reagent). In some embodiments, a target region is first amplified (e.g., by PCR) to produce a target amplicon for sequencing. In some embodiments, amplifying the target region comprises amplification of the target region for 5 to 15 cycles (e.g., a “limited cycle” or “low-cycle” amplification).

Further embodiments provide that the target amplicon comprises a tag (e.g., comprises a barcode sequence), e.g., the target amplicon is an identifiable amplicon. In some embodiments, a primer used in the amplification of the target region comprises a tag (e.g., comprising a barcode sequence) that is subsequently incorporated into the target amplicon (e.g., in a “copy and tag” reaction) to produce an identifiable amplicon. In some embodiments, an adaptor comprising the tag (e.g., comprising a barcode sequence) is ligated to the target amplicon after amplification (e.g., in a ligase reaction) to produce an identifiable adaptor-amplicon. In some embodiments, the primer used to produce an identifiable amplicon in a copy and tag reaction comprises a 3′ region comprising a target-specific priming sequence and a 5′ region comprising two different universal sequences (e.g., a universal sequence A and a universal sequence B) flanking a degenerate sequence. In some embodiments, an adaptor ligated to an amplicon to produce an identifiable adaptor-amplicon is a double stranded adaptor, e.g., comprising one strand comprising a degenerate sequence (e.g., comprising 8 to 12 bases) flanked on both the 5′ end and the 3′ end by two different universal sequences (e.g., a universal sequence A and a universal sequence B) and a second strand comprising a universal sequence C (e.g., at the 5′ end) and a sequence (e.g., at the 3′ end) that is complementary to the universal sequence B and that has an additional T at the 3′-terminal position.

Embodiments of the technology provide for the generation of nucleic acid ladder fragments from an adaptor-amplicon, e.g., to provide a sequencing library for NGS. In particular, the technology provides for the generation of a 3′-O-propargyl-dN terminated nucleic acid ladder for nucleic acid sequencing (e.g., NGS), e.g., by using a polymerase reaction comprising standard dNTPs and 3′-O-propargyl-dNTPs at a molar ratio of from 1:500 to 500:1 (standard dNTPs to 3′-O-propargyl-dNTPs). Then, in some embodiments, the technology provides for attaching an adaptor to the 3′ ends of the nucleic acid fragments using chemical conjugation. For example, in some embodiments, a 5′-azido-modified oligonucleotide (e.g., a 5′-azido-methyl-modified oligonucleotide) is conjugated to the 3′-propargyl-terminated nucleic acid fragments by click chemistry (e.g., in a reaction catalyzed by a copper (e.g., copper (I)) reagent).
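As a rough illustration of how the molar ratio of standard dNTPs to 3′-O-propargyl-dNTPs could shape a fragment ladder, the following Python sketch uses an idealized model that is an assumption of this example and not a statement of the disclosed method: the polymerase is assumed to incorporate the analog and the standard nucleotide with equal efficiency, and the template is assumed to be long enough that it never limits extension. Under those assumptions, termination at each position is a Bernoulli trial and the extension length follows a geometric distribution. The function names are hypothetical.

```python
from fractions import Fraction


def termination_probability(standard: int, analog: int) -> Fraction:
    """Per-position chance of incorporating a terminating analog.

    Assumes (for illustration only) that the polymerase does not
    discriminate between standard dNTPs and the 3'-O-propargyl analogs.
    """
    return Fraction(analog, standard + analog)


def expected_extension_length(standard: int, analog: int) -> Fraction:
    """Mean number of bases added, counting the terminating base.

    This is the mean of a geometric distribution, 1 / p, and ignores
    the finite length of a real template.
    """
    return 1 / termination_probability(standard, analog)


# Ratios are written as standard dNTPs : 3'-O-propargyl-dNTPs.
for standard, analog in [(500, 1), (10, 1), (1, 1), (1, 500)]:
    p = termination_probability(standard, analog)
    mean_len = expected_extension_length(standard, analog)
    print(f"{standard}:{analog}  p={float(p):.4f}  "
          f"mean extension={float(mean_len):.1f} bases")
```

In practice a polymerase may incorporate the analog more or less readily than the natural substrate, so a usable ratio would be determined empirically; the sketch only shows the direction of the effect (more analog, shorter fragments).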

Some embodiments of the technology provide a composition for use as a next-generation sequencing library to obtain a sequence of a target nucleic acid, the composition comprising n nucleic acids (e.g., a nucleic acid fragment library), wherein each of the n nucleic acids comprises a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog such as a 3′-O-propargyl nucleotide analog). In some embodiments, each nucleic acid of the n nucleic acids comprises a nucleotide subsequence of a target nucleotide sequence.

In particular, embodiments provide a composition comprising n nucleic acids, wherein each of the n nucleic acids is terminated by a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog such as a 3′-O-propargyl nucleotide analog). Further embodiments provide a composition comprising n nucleic acids (e.g., a nucleic acid fragment library), wherein each of the n nucleic acids comprises a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog such as a 3′-O-propargyl nucleotide analog) and each of the n nucleic acids is conjugated (e.g., linked) to an oligonucleotide adaptor by a triazole linkage (e.g., a linkage formed from a chemical conjugation of a propargyl group and an azido group, e.g., by a click chemistry reaction). For example, some embodiments provide a composition comprising n nucleic acids (e.g., a nucleic acid fragment library), wherein each of the n nucleic acids comprises a 3′-O-propargyl nucleotide analog (e.g., a 3′-O-propargyl-dA, 3′-O-propargyl-dC, 3′-O-propargyl-dG, and/or a 3′-O-propargyl-dT) conjugated (e.g., linked) to an oligonucleotide adaptor by a triazole linkage (e.g., a linkage formed from a chemical conjugation of a propargyl group and an azido group, e.g., by a click chemistry reaction).

In some embodiments, the composition for use as a next-generation sequencing library to obtain a sequence of a target nucleic acid is produced by a method comprising synthesizing n nucleic acids (e.g., a nucleic acid fragment library) using a mixture of dNTPs and one or more 3′-O-blocked nucleotide analog(s) (e.g., one or more 3′-O-alkynyl nucleotide analog(s) such as one or more 3′-O-propargyl nucleotide analog(s)), e.g., at a molar ratio of from 1:500 to 500:1 (e.g., 1:500, 1:450, 1:400, 1:350, 1:300, 1:250, 1:200, 1:150, 1:100, 1:90, 1:80, 1:70, 1:60, 1:50, 1:40, 1:30, 1:20, 1:10, 1:9, 1:8, 1:7, 1:6, 1:5, 1:4, 1:3, 1:2, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, 150:1, 200:1, 250:1, 300:1, 350:1, 400:1, 450:1, or 500:1). In some embodiments, the composition is produced using a polymerase obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species (e.g., an organism of the taxonomic lineage Archaea; Euryarchaeota; Thermococci; Thermococcales; Thermococcaceae; Thermococcus). In some embodiments, the polymerase is obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species 9° N-7. In some embodiments, the polymerase comprises amino acid substitutions that provide for improved incorporation of modified substrates such as modified dideoxynucleotides, ribonucleotides, and acyclonucleotides. In some embodiments, the polymerase comprises amino acid substitutions that provide for improved incorporation of nucleotide analogs comprising modified 3′ functional groups such as the 3′-O-propargyl dNTPs described herein. In some embodiments the amino acid sequence of the polymerase comprises one or more amino acid substitutions relative to the Thermococcus sp. 9° N-7 wild-type polymerase amino acid sequence, e.g., a substitution of alanine for the aspartic acid at amino acid position 141 (D141A), a substitution of alanine for the glutamic acid at amino acid position 143 (E143A), a substitution of valine for the tyrosine at amino acid position 409 (Y409V), and/or a substitution of leucine for the alanine at amino acid position 485 (A485L). In some embodiments, the polymerase is provided in a heterologous host organism such as Escherichia coli that comprises a cloned Thermococcus sp. 9° N-7 polymerase gene, e.g., comprising one or more mutations (e.g., D141A, E143A, Y409V, and/or A485L). In some embodiments, the polymerase is a Thermococcus sp. 9° N-7 polymerase sold under the trade name THERMINATOR (e.g., THERMINATOR II) by New England BioLabs (Ipswich, Mass.).

Accordingly, the technology relates to reaction mixtures comprising a target nucleic acid, a mixture of dNTPs and one or more 3′-O-blocked nucleotide analog(s) (e.g., one or more 3′-O-alkynyl nucleotide analog(s) such as one or more 3′-O-propargyl nucleotide analog(s)), e.g., at a molar ratio of from 1:500 to 500:1 (e.g., 1:500, 1:450, 1:400, 1:350, 1:300, 1:250, 1:200, 1:150, 1:100, 1:90, 1:80, 1:70, 1:60, 1:50, 1:40, 1:30, 1:20, 1:10, 1:9, 1:8, 1:7, 1:6, 1:5, 1:4, 1:3, 1:2, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, 150:1, 200:1, 250:1, 300:1, 350:1, 400:1, 450:1, or 500:1), and a polymerase for synthesizing a nucleic acid using the dNTPs and one or more 3′-O-blocked nucleotide analog(s) (e.g., a polymerase obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species). In some embodiments, the target nucleic acid is an amplicon. In some embodiments, the target nucleic acid comprises a barcode. In some embodiments, the target nucleic acid is an amplicon comprising a barcode. In some embodiments, the target nucleic acid is an amplicon ligated to an adaptor comprising a barcode. Some embodiments provide reaction mixtures that comprise a plurality of target nucleic acids, each target nucleic acid comprising a barcode associated with an identifiable characteristic of the target nucleic acid.

Some embodiments provide a reaction mixture composition comprising a template (e.g., a circular template, e.g., comprising a universal nucleotide sequence and/or a barcode nucleotide sequence) comprising a subsequence of a target nucleic acid, a polymerase, one or more fragments of a ladder fragment library, and a 3′-O-blocked nucleotide analog.

Some embodiments provide a reaction mixture composition comprising a library of nucleic acids, the library of nucleic acids comprising overlapping short nucleotide sequences tiled over a target nucleic acid (e.g., the overlapping short nucleotide sequences cover a region of the target nucleic acid comprising 100 bases, 200 bases, 300 bases, 400 bases, 500 bases, 600 bases, 700 bases, 800 bases, 900 bases, 1000 bases, or more than 1000 bases, e.g., 2000 bases, 2500 bases, 3000 bases, 3500 bases, 4000 bases, 4500 bases, 5000 bases, or more than 5000 bases) and offset from one another by 1-20, 1-10, or 1-5 bases (e.g., 1 base) and each nucleic acid of the library comprising less than 100 bases, less than 90 bases, less than 80 bases, less than 70 bases, less than 60 bases, less than 50 bases, less than 45 bases, less than 40 bases, less than 35 bases, or less than 30 bases.
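The tiling arrangement described above (short overlapping sequences covering a target region and offset from one another by a fixed number of bases) can be pictured with a minimal Python sketch. The function name, the toy target sequence, and the use of a fixed fragment length are assumptions made for illustration; they are not taken from the disclosure, which describes a range of lengths and offsets.

```python
def tile_fragments(target: str, fragment_length: int, offset: int = 1) -> list[str]:
    """Return overlapping subsequences of `target`, each `fragment_length`
    bases long, with consecutive start positions `offset` bases apart."""
    if fragment_length <= 0 or offset <= 0:
        raise ValueError("fragment_length and offset must be positive")
    last_start = len(target) - fragment_length
    # An empty list is returned when the target is shorter than one fragment.
    return [target[i:i + fragment_length]
            for i in range(0, last_start + 1, offset)]


# Toy 8-base target tiled with 4-base fragments offset by 1 base.
for fragment in tile_fragments("ACGTACGT", fragment_length=4, offset=1):
    print(fragment)
```

With an offset of 1 base, every position of the covered region appears in several fragments, which is what allows short reads to be assembled into a longer sequence.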

Additional embodiments are provided below and as variations of the technology described as understood by a person having ordinary skill in the art.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, aspects, and advantages of the present technology will become better understood with regard to the following drawings:

FIG. 1 is a schematic showing a polymerase extension reaction using 3′-O-propargyl-dGTP. The polymerase extension halts after the incorporation of 3′-O-propargyl-dGTP, producing product 1. A 5′-azide-modified DNA fragment is chemically ligated to product 1 using click chemistry producing product 2. The covalent linkage created by the formation of the triazole ring mimics that of the natural DNA backbone phosphodiester linkage. Product 2 is used subsequently in enzymatic reactions (e.g., PCR).

FIG. 2 is a schematic showing a polymerase extension reaction using a combination of dNTPs and 3′-O-propargyl-dNTPs. DNA ladder fragments (n+1 fragments) are generated with each of the fragments' 3′-ends having an alkyne group. These DNA ladder fragments are ligated to a 5′-azide-modified DNA molecule, which has a “universal” sequence and/or a barcode sequence and/or a primer binding site, via click chemistry. The ligated DNA fragments are subsequently treated and used as input in next generation sequencing (NGS) processes. These DNA fragments with the n+1 characteristic produce DNA sequencing data by assembling short reads, thereby significantly decreasing the NGS run time.

FIGS. 3A-3G show analytical data for 3′-O-propargyl-2′-deoxycytidine-5′-triphosphate (3′-O-propargyl-dCTP) synthesized as described herein. FIG. 3A shows ¹H NMR data for 3′-O-propargyl-dCTP. FIG. 3B shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dCTP shown in FIG. 3A. FIG. 3C shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dCTP shown in FIG. 3A. FIG. 3D shows ³¹P NMR data for 3′-O-propargyl-dCTP. FIG. 3E shows an enlarged portion of the ³¹P NMR data for 3′-O-propargyl-dCTP shown in FIG. 3D. FIG. 3F shows anion-exchange HPLC data for 3′-O-propargyl-dCTP. FIG. 3G shows high-resolution mass spectrum data for 3′-O-propargyl-dCTP.

FIGS. 4A-4G show analytical data for 3′-O-propargyl-2′-deoxythymidine-5′-triphosphate (3′-O-propargyl-dTTP) synthesized as described herein. FIG. 4A shows ¹H NMR data for 3′-O-propargyl-dTTP. FIG. 4B shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dTTP shown in FIG. 4A. FIG. 4C shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dTTP shown in FIG. 4A. FIG. 4D shows ³¹P NMR data for 3′-O-propargyl-dTTP. FIG. 4E shows an enlarged portion of the ³¹P NMR data for 3′-O-propargyl-dTTP shown in FIG. 4D. FIG. 4F shows anion-exchange HPLC data for 3′-O-propargyl-dTTP. FIG. 4G shows high-resolution mass spectrum data for 3′-O-propargyl-dTTP.

FIGS. 5A-5G show analytical data for 3′-O-propargyl-2′-deoxyadenosine-5′-triphosphate (3′-O-propargyl-dATP) synthesized as described herein. FIG. 5A shows ¹H NMR data for 3′-O-propargyl-dATP. FIG. 5B shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dATP shown in FIG. 5A. FIG. 5C shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dATP shown in FIG. 5A. FIG. 5D shows ³¹P NMR data for 3′-O-propargyl-dATP. FIG. 5E shows an enlarged portion of the ³¹P NMR data for 3′-O-propargyl-dATP shown in FIG. 5D. FIG. 5F shows anion-exchange HPLC data for 3′-O-propargyl-dATP. FIG. 5G shows high-resolution mass spectrum data for 3′-O-propargyl-dATP.

FIGS. 6A-6G show analytical data for 3′-O-propargyl-2′-deoxyguanosine-5′-triphosphate (3′-O-propargyl-dGTP) synthesized as described herein. FIG. 6A shows ¹H NMR data for 3′-O-propargyl-dGTP. FIG. 6B shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dGTP shown in FIG. 6A. FIG. 6C shows an enlarged portion of the ¹H NMR data for 3′-O-propargyl-dGTP shown in FIG. 6A. FIG. 6D shows ³¹P NMR data for 3′-O-propargyl-dGTP. FIG. 6E shows an enlarged portion of the ³¹P NMR data for 3′-O-propargyl-dGTP shown in FIG. 6D. FIG. 6F shows anion-exchange HPLC data for 3′-O-propargyl-dGTP. FIG. 6G shows high-resolution mass spectrum data for 3′-O-propargyl-dGTP.

It is to be understood that the figures are not necessarily drawn to scale, nor are the objects in the figures necessarily drawn to scale in relationship to one another. The figures are depictions that are intended to bring clarity and understanding to various embodiments of apparatuses, systems, and methods disclosed herein. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. Moreover, it should be appreciated that the drawings are not intended to limit the scope of the present teachings in any way.

DETAILED DESCRIPTION

Provided herein is technology relating to the manipulation and detection of nucleic acids, including but not limited to compositions, methods, systems, and kits related to nucleotides comprising a chemically reactive linking moiety. In particular embodiments, the technology provides nucleotide analogs comprising a base (e.g., adenine, guanine, cytosine, thymine, or uracil), a sugar (e.g., a ribose or deoxyribose), and an alkyne chemical moiety, e.g., attached to the 3′ oxygen of the sugar (e.g., the 3′ oxygen of the deoxyribose or the 3′ oxygen of the ribose). The nucleotide analogs (e.g., a 3′-alkynyl nucleotide analog, e.g., a 3′-O-propargyl nucleotide analog such as a 3′-O-propargyl dNTP or a 3′-O-propargyl NTP) find use in embodiments of the technology to introduce a particular chemical moiety (e.g., an alkyne) at the end (e.g., the 3′ end) of a nucleic acid (e.g., a DNA or RNA) by a polymerase extension reaction, and, consequently, to produce a nucleic acid modification that does not exist in natural biological systems. Chemical ligation between the polymerase extension products and appropriate conjugation partners (e.g., azide modified entities) is achieved with high efficiency and specificity using click chemistry. Embodiments of the functional nucleotide terminators provided herein are used to produce nucleic acids that are useful for various molecular biology, biochemical, and biotechnology applications.

The technology provides several advantages over current technologies. For instance, the technology provides sequence data that is better than (e.g., higher quality, longer reads, fewer errors, etc.) or comparable to that of existing technologies, and does so in a shorter run time than existing technologies. Moreover, the technology provides sequence data reads that can be stitched together to provide a longer read of high quality.

The section headings used herein are for organizational purposes only and are not to be construed as limiting the described subject matter in any way. In this detailed description of the various embodiments, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of the embodiments disclosed. One skilled in the art will appreciate, however, that these various embodiments may be practiced with or without these specific details. In other instances, structures and devices are shown in block diagram form. Furthermore, one skilled in the art can readily appreciate that the specific sequences in which methods are presented and performed are illustrative and it is contemplated that the sequences can be varied and still remain within the spirit and scope of the various embodiments disclosed herein.

All literature and similar materials cited in this application, including but not limited to, patents, patent applications, articles, books, treatises, and internet web pages are expressly incorporated by reference in their entirety for any purpose. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which the various embodiments described herein belong. When definitions of terms in incorporated references appear to differ from the definitions provided in the present teachings, the definition provided in the present teachings shall control.

Definitions

To facilitate an understanding of the present technology, a number of terms and phrases are defined below. Additional definitions are set forth throughout the detailed description.

Throughout the specification and claims, the following terms take the meanings explicitly associated herein, unless the context clearly dictates otherwise. The phrase “in one embodiment” as used herein does not necessarily refer to the same embodiment, though it may. Furthermore, the phrase “in another embodiment” as used herein does not necessarily refer to a different embodiment, although it may. Thus, as described below, various embodiments of the invention may be readily combined, without departing from the scope or spirit of the invention.

In addition, as used herein, the term “or” is an inclusive “or” operator and is equivalent to the term “and/or” unless the context clearly dictates otherwise. The term “based on” is not exclusive and allows for being based on additional factors not described, unless the context clearly dictates otherwise. In addition, throughout the specification, the meaning of “a”, “an”, and “the” include plural references. The meaning of “in” includes “in” and “on.”

As used herein, a “nucleotide” comprises a “base” (alternatively, a “nucleobase” or “nitrogenous base”), a “sugar” (in particular, a five-carbon sugar, e.g., ribose or 2-deoxyribose), and a “phosphate moiety” of one or more phosphate groups (e.g., a monophosphate, a diphosphate, a triphosphate, a tetraphosphate, etc. consisting of one, two, three, four or more linked phosphates, respectively). Without the phosphate moiety, the nucleobase and the sugar compose a “nucleoside”. A nucleotide can thus also be called a nucleoside monophosphate or a nucleoside diphosphate or a nucleoside triphosphate, depending on the number of phosphate groups attached. The phosphate moiety is usually attached to the 5′-carbon of the sugar, though some nucleotides comprise phosphate moieties attached to the 2′-carbon or the 3′-carbon of the sugar. Nucleotides contain either a purine base (e.g., adenine or guanine) or a pyrimidine base (e.g., cytosine, thymine, or uracil). Some nucleotides contain non-natural bases. Ribonucleotides are nucleotides in which the sugar is ribose. Deoxyribonucleotides are nucleotides in which the sugar is deoxyribose.

As used herein, a “nucleic acid” shall mean any nucleic acid molecule, including, without limitation, DNA, RNA, and hybrids thereof. The nucleic acid bases that form nucleic acid molecules can be the bases A, C, G, T and U, as well as derivatives thereof.

Derivatives of these bases are well known in the art. The term should beunderstood to include, as equivalents, analogs of either DNA or RNA madefrom nucleotide analogs. The term as used herein also encompasses cDNAthat is complementary DNA produced from an RNA template, for example bythe action of a reverse transcriptase. It is well known that DNA(deoxyribonucleic acid) is a chain of nucleotides consisting of 4 typesof nucleotides—A (adenine), T (thymine), C (cytosine), and G(guanine)—and that RNA (ribonucleic acid) is a chain of nucleotidesconsisting of 4 types of nucleotides—A, U (uracil), G, and C. It is alsoknown that all of these 5 types of nucleotides specifically bind to oneanother in combinations called complementary base pairing. That is,adenine (A) pairs with thymine (T) (in the case of RNA, however, adenine(A) pairs with uracil (U)) and cytosine (C) pairs with guanine (G), sothat each of these base pairs forms a double strand. As used herein,“nucleic acid sequencing data”, “nucleic acid sequencing information”,“nucleic acid sequence”, “genomic sequence”, “genetic sequence”,“fragment sequence”, or “nucleic acid sequencing read” denotes anyinformation or data that is indicative of the order of the nucleotidebases (e.g., adenine, guanine, cytosine, and thymine/uracil) in amolecule (e.g., a whole genome, a whole transcriptome, an exome,oligonucleotide, polynucleotide, fragment, etc.) of DNA or RNA. Itshould be understood that the present teachings contemplate sequenceinformation obtained using all available varieties of techniques,platforms or technologies, including, but not limited to: capillaryelectrophoresis, microarrays, ligation-based systems, polymerase-basedsystems, hybridization-based systems, direct or indirect nucleotideidentification systems, pyrosequencing, ion- or pH-based detectionsystems, electronic signature-based systems, pore-based (e.g.,nanopore), visualization-based systems, etc.

Reference to a base, a nucleotide, or to another molecule may be in the singular or plural. That is, “a base” may refer to a single molecule of that base or to a plurality of the base, e.g., in a solution.

A “polynucleotide”, “nucleic acid”, or “oligonucleotide” refers to a linear polymer of nucleosides (including deoxyribonucleosides, ribonucleosides, or analogs thereof) joined by internucleosidic linkages. Typically, a polynucleotide comprises at least three nucleosides. Usually oligonucleotides range in size from a few monomeric units, e.g., 3 to 4, to several hundreds of monomeric units. Whenever a polynucleotide such as an oligonucleotide is represented by a sequence of letters, such as “ATGCCTG”, it will be understood that the nucleotides are in 5′ to 3′ order from left to right and that “A” denotes deoxyadenosine, “C” denotes deoxycytidine, “G” denotes deoxyguanosine, and “T” denotes thymidine, unless otherwise noted. The letters A, C, G, and T may be used to refer to the bases themselves, to nucleosides, or to nucleotides comprising the bases, as is standard in the art.

As used herein, the phrase “dNTP” means deoxynucleotide triphosphate, where the nucleotide comprises a nucleotide base, such as A, T, C, G or U.

The term “monomer” as used herein means any compound that can be incorporated into a growing molecular chain by a given polymerase. Such monomers include, without limitation, naturally occurring nucleotides (e.g., ATP, GTP, TTP, UTP, CTP, dATP, dGTP, dTTP, dUTP, dCTP, synthetic analogs), precursors for each nucleotide, non-naturally occurring nucleotides and their precursors or any other molecule that can be incorporated into a growing polymer chain by a given polymerase.

As used herein, “complementary” generally refers to specific nucleotide duplexing to form canonical Watson-Crick base pairs, as is understood by those skilled in the art. However, complementary also includes base-pairing of nucleotide analogs that are capable of universal base-pairing with A, T, G or C nucleotides and locked nucleic acids that enhance the thermal stability of duplexes. One skilled in the art will recognize that hybridization stringency is a determinant in the degree of match or mismatch in the duplex formed by hybridization.
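For readers who handle sequences computationally, the canonical Watson-Crick pairing referred to above, together with the 5′ to 3′ left-to-right convention for written sequences, can be sketched in Python as follows. This is an illustrative sketch only: it covers the canonical DNA pairs (and treats U like T), it does not model universal bases, locked nucleic acids, or hybridization stringency, and the function names are hypothetical.

```python
# Canonical Watson-Crick partners; U is mapped to A so that RNA input
# is accepted, and the output is always written in DNA letters.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(strand_a: str, strand_b: str) -> bool:
    """True if two 5'-to-3' DNA strands pair at every position."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


print(reverse_complement("ATGCCTG"))
```

The reversal step reflects that the two strands of a duplex are antiparallel, so the complement read 5′ to 3′ runs in the opposite direction to the input.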

As used herein, “moiety” refers to one of two or more parts into which something may be divided, such as, for example, the various parts of a tether, a molecule, or a probe.

As used herein, a “linker” is a molecule or moiety that joins two molecules or moieties and/or provides spacing between the two molecules or moieties such that they are able to function in their intended manner. For example, a linker can comprise a diamine hydrocarbon chain that is covalently bound through a reactive group on one end to an oligonucleotide analog molecule and through a reactive group on another end to a solid support, such as, for example, a bead surface. Coupling of linkers to nucleotides and substrate constructs of interest can be accomplished through the use of coupling reagents that are known in the art (see, e.g., Efimov et al., Nucleic Acids Res. 27: 4416-4426, 1999). Methods of derivatizing and coupling organic molecules are well known in the arts of organic and bioorganic chemistry. A linker may also be cleavable (e.g., photocleavable) or reversible.

A “polymerase” is an enzyme generally for joining 3′-OH, 5′-triphosphate nucleotides, oligomers, and their analogs. Polymerases include, but are not limited to, DNA-dependent DNA polymerases, DNA-dependent RNA polymerases, RNA-dependent DNA polymerases, RNA-dependent RNA polymerases, T7 DNA polymerase, T3 DNA polymerase, T4 DNA polymerase, T7 RNA polymerase, T3 RNA polymerase, SP6 RNA polymerase, DNA polymerase I, Klenow fragment, Thermus aquaticus DNA polymerase, Tth DNA polymerase, Vent DNA polymerase (New England Biolabs), Deep Vent DNA polymerase (New England Biolabs), Bst DNA Polymerase Large Fragment, Stoffel Fragment, 9° N DNA Polymerase, Pfu DNA Polymerase, Tfl DNA Polymerase, RepliPHI Phi29 Polymerase, Tli DNA polymerase, eukaryotic DNA polymerase beta, telomerase, Therminator polymerase (New England Biolabs) (e.g., Therminator I, Therminator II, and other variants), KOD HiFi DNA polymerase (Novagen), KOD1 DNA polymerase, Q-beta replicase, terminal transferase, AMV reverse transcriptase, M-MLV reverse transcriptase, Phi6 reverse transcriptase, HIV-1 reverse transcriptase, novel polymerases discovered by bioprospecting, and polymerases cited in U.S. Pat. Appl. Pub. No. 2007/0048748 and in U.S. Pat. Nos. 6,329,178; 6,602,695; and 6,395,524. These polymerases include wild-type, mutant isoforms, and genetically engineered variants such as exo⁻ polymerases and other mutants, e.g., that tolerate modified (e.g., labeled) nucleotides and incorporate them into a strand of nucleic acid.

The term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (e.g., in the presence of nucleotides and an inducing agent such as a polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers depend on many factors including temperature, source of primer, and the use of the method.

As used herein, an “adaptor” is an oligonucleotide that is linked or is designed to be linked to a nucleic acid to introduce the nucleic acid into a sequencing workflow. An adaptor may be single-stranded or double-stranded (e.g., a double-stranded DNA or a single-stranded DNA). As used herein, the term “adaptor” refers to the adaptor nucleic acid both in a state that is not linked to another nucleic acid and in a state that is linked to a nucleic acid.

At least a portion of the adaptor comprises a known sequence. For example, some embodiments of adaptors comprise a primer binding sequence for amplification of the nucleic acid and/or for binding of a sequencing primer. Some adaptors comprise a sequence for hybridization of a complementary capture probe. Some adaptors comprise a chemical or other moiety (e.g., a biotin moiety) for capture and/or immobilization to a solid support (e.g., comprising an avidin moiety). Some embodiments of adaptors comprise a marker, index, barcode, tag, or other sequence by which the adaptor and a nucleic acid to which it is linked are identifiable.

Some adaptors comprise a universal sequence. A universal sequence is a sequence shared by a plurality of adaptors that may otherwise have different sequences outside of the universal sequence. For example, a universal sequence provides a common primer binding site for a collection of nucleic acids from different target nucleic acids, e.g., that may comprise different barcodes.

Some embodiments of adaptors comprise a defined but unknown sequence. For example, some embodiments of adaptors comprise a degenerate sequence of a defined number of bases (e.g., a 1- to 20-base degenerate sequence). Such a sequence is defined even if each individual sequence is not known; it may nevertheless serve as an index, barcode, tag, etc. marking nucleic acid fragments from, e.g., the same target nucleic acid.
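As a rough illustration of why a short degenerate sequence can still serve as an index, the following sketch counts the distinct sequences available for a degenerate stretch of a given length. It assumes each position may be any of the four standard bases with no further constraint; the function name and the chosen lengths are illustrative only and are not taken from the technology described herein.

```python
def degenerate_sequence_count(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences for a fully degenerate stretch.

    Assumes every position is drawn independently from `alphabet_size`
    bases (4 for A, C, G, T), which is an illustrative simplification.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


# Illustrative lengths spanning the 1- to 20-base range mentioned above.
counts = {n: degenerate_sequence_count(n) for n in (1, 8, 20)}
for n, count in counts.items():
    print(f"{n}-base degenerate sequence: {count} possible sequences")
```

Under these assumptions the number of possible indices grows as a power of four with the length of the degenerate stretch.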

Some adaptors comprise a blunt end and some adaptors comprise an end with an overhang of one or more bases.

In particular embodiments provided herein, an adaptor comprises an azido moiety, e.g., the adaptor comprises an azido (e.g., an azido-methyl) moiety on its 5′ end. Thus, some embodiments are related to adaptors that are or that comprise a 5′-azido-modified oligonucleotide or a 5′-azido-methyl-modified oligonucleotide.

In some embodiments, a unique index (a “marker” in some embodiments) is used to associate a fragment with the template nucleic acid from which it was produced. In some embodiments, a unique index is a unique sequence of synthetic nucleotides or a unique sequence of natural nucleotides that allows for easy identification of the target nucleic acid within a complicated collection of oligonucleotides (e.g., fragments) containing various sequences. In certain embodiments, unique index identifiers are attached to nucleic acid fragments prior to attaching adaptor sequences. In some embodiments, unique index identifiers are contained within adaptor sequences such that the unique sequence is contained in the sequencing reads. This ensures that homologous fragments can be detected based upon the unique indices that are attached to each fragment, thus further providing for unambiguous reconstruction of a consensus sequence. Homologous fragments may occur, for example, by chance due to genomic repeats, two fragments originating from homologous chromosomes, or fragments originating from overlapping locations on the same chromosome. Homologous fragments may also arise from closely related sequences (e.g., closely related gene family members, paralogs, orthologs, ohnologs, xenologs, and/or pseudogenes). Such fragments may be discarded to ensure that long fragment assembly can be computed unambiguously. The markers may be attached as described above for the adaptor sequences. The indices (e.g., markers) may be included in the adaptor sequences.

In some embodiments, the unique index (e.g., index identifier, tag, marker, etc.) is a “barcode”. As used herein, the term “barcode” refers to a known nucleic acid sequence that allows some feature of a nucleic acid with which the barcode is associated to be identified. In some embodiments, the feature of the nucleic acid to be identified is the sample or source from which the nucleic acid is derived. In some embodiments, barcodes are at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length. In some embodiments, barcodes are shorter than 10, 9, 8, 7, 6, 5, or 4 nucleotides in length. In some embodiments, barcodes associated with some nucleic acids are of a different length than barcodes associated with other nucleic acids. In general, barcodes are of sufficient length and comprise sequences that are sufficiently different to allow the identification of samples based on barcodes with which they are associated. In some embodiments, a barcode and the sample source with which it is associated can be identified accurately after the mutation, insertion, or deletion of one or more nucleotides in the barcode sequence, such as the mutation, insertion, or deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides. In some embodiments, each barcode in a plurality of barcodes differs from every other barcode in the plurality at two or more nucleotide positions, such as at 2, 3, 4, 5, 6, 7, 8, 9, 10, or more positions. In some embodiments, one or more adaptors comprise(s) at least one of a plurality of barcode sequences. In some embodiments, methods of the technology further comprise identifying the sample or source from which a target nucleic acid is derived based on a barcode sequence to which the target nucleic acid is joined. In some embodiments, methods of the technology further comprise identifying the target nucleic acid based on a barcode sequence to which the target nucleic acid is joined. Some embodiments of the method further comprise identifying a source or sample of the target nucleotide sequence by determining a barcode nucleotide sequence. Some embodiments of the method further comprise molecular counting applications (e.g., digital barcode enumeration and/or binning) to determine expression levels or copy number status of desired targets. In general, a barcode may comprise a nucleic acid sequence that when joined to a target nucleic acid serves as an identifier of the sample from which the target polynucleotide was derived.
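The requirement that barcodes differ from one another at several positions, and the ability to identify a barcode after some of its nucleotides have mutated, can be sketched with a Hamming-distance check. The barcode sequences, function names, and mismatch threshold below are hypothetical, and the sketch handles substitutions only; insertions and deletions would call for an edit-distance measure instead.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_barcode(observed, barcodes, max_mismatches):
    """Return the single barcode within max_mismatches of the observed
    sequence, or None if no barcode or more than one barcode qualifies."""
    hits = [b for b in barcodes if hamming(observed, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 6-nucleotide barcode set, for illustration only.
BARCODES = ["ACGTAC", "TGCATG", "GATCCA"]

distance = min_pairwise_distance(BARCODES)
# With minimum distance d, up to (d - 1) // 2 substitutions per barcode
# can be corrected without ambiguity.
tolerated = (distance - 1) // 2
recovered = assign_barcode("ACGTAA", BARCODES, tolerated)
print(distance, tolerated, recovered)
```

In this sketch a read whose barcode carries one substitution is still assigned to its source, while a sequence that is far from every barcode is left unassigned rather than guessed.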

In some embodiments, an oligonucleotide such as a primer, adaptor, etc. comprises a “universal” sequence. A universal sequence is a known sequence, e.g., for use as a primer or probe binding site using a primer or probe of a known sequence (e.g., complementary to the universal sequence). While a template-specific sequence of a primer, a barcode sequence of a primer, and/or a barcode sequence of an adaptor might differ in embodiments of the technology, e.g., from fragment to fragment, from sample to sample, from source to source, or from region of interest to region of interest, embodiments of the technology provide that a universal sequence is the same from fragment to fragment, from sample to sample, from source to source, or from region of interest to region of interest so that all fragments comprising the universal sequence can be handled and/or treated in a same or similar manner, e.g., amplified, identified, sequenced, isolated, etc., using similar methods or techniques (e.g., using the same primer or probe).

In particular embodiments, a primer is used comprising a universal sequence (e.g., universal sequence A), a barcode sequence, and a template-specific sequence. In particular embodiments, a first adaptor comprising a universal sequence (e.g., universal sequence B) is used and in particular embodiments, a second adaptor comprising a universal sequence (e.g., universal sequence C) is used. Universal sequence A, universal sequence B, and universal sequence C can be any sequence. This nomenclature is used to note that the universal sequence A of a first nucleic acid (e.g., a fragment) comprising universal sequence A is the same as the universal sequence A of a second nucleic acid (e.g., a fragment) comprising universal sequence A, the universal sequence B of a first nucleic acid (e.g., a fragment) comprising universal sequence B is the same as the universal sequence B of a second nucleic acid (e.g., a fragment) comprising universal sequence B, and the universal sequence C of a first nucleic acid (e.g., a fragment) comprising universal sequence C is the same as the universal sequence C of a second nucleic acid (e.g., a fragment) comprising universal sequence C. While universal sequences A, B, and C are generally different in embodiments of the technology, they need not be. Thus, in some embodiments, universal sequences A and B are the same; in some embodiments, universal sequences B and C are the same; in some embodiments, universal sequences A and C are the same; and in some embodiments, universal sequences A, B, and C are the same. In some embodiments, universal sequences A, B, and C are different.

For example, if two regions of interest are to be sequenced (e.g., from the same or different sources or, e.g., from two different regions of the same nucleic acid, chromosome, gene, etc.), two primers may be used, one primer comprising a first template-specific sequence for priming from the first region of interest and a first barcode to associate the first amplified product with the first region of interest and a second primer comprising a second template-specific sequence for priming from the second region of interest and a second barcode to associate the second amplified product with the second region of interest. These two primers, however, in some embodiments, will comprise the same universal sequence (e.g., universal sequence A) for pooling and downstream processing together. Two or more universal sequences may be used and, in general, the number of universal sequences will be less than the number of target-specific sequences and/or barcode sequences for pooling of samples and treatment of pools as a single sample (batch).

Accordingly, in some embodiments, determining the first nucleotide subsequence and the second nucleotide subsequence comprises priming from a universal sequence. In some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-blocked nucleotide analog. For example, in some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-alkynyl nucleotide analog, e.g., in some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-propargyl nucleotide analog. In some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a nucleotide analog comprising a reversible terminator.

The obtained short sequence reads are partitioned according to their barcode (i.e., de-multiplexed) and reads originating from the same samples, sources, regions of interest, etc. are binned together, e.g., saved to separate files or held in an organized data structure that allows binned reads to be identified as such. Then the binned short sequences are assembled into a consensus sequence. Sequence assembly can generally be divided into two broad categories: de novo assembly and reference genome mapping assembly. In de novo assembly, sequence reads are assembled together so that they form a new and previously unknown sequence. In reference genome mapping, sequence reads are assembled against an existing backbone sequence (e.g., a reference sequence, etc.) to build a sequence that is similar but not necessarily identical to the backbone sequence.
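The de-multiplexing and binning step described above can be sketched as follows. This is a minimal illustration that assumes the barcode occupies a fixed number of bases at the start of each read and must match a known barcode exactly; the read sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Bin reads by their leading barcode.

    Returns a dict mapping each sample name to the list of read inserts
    (the read with its barcode removed). Reads whose barcode is not
    recognized are collected under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcode_to_sample = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTTTGACC", "TGCAGGATCC", "ACGTCCATGG", "NNNNAAAA"]

binned = demultiplex(reads, barcode_to_sample, barcode_length=4)
for sample, inserts in binned.items():
    print(sample, inserts)
```

The dictionary returned here stands in for the "organized data structure" mentioned above; writing each bin to its own file would be an equally valid choice.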

Thus, in some embodiments, target nucleic acids corresponding to each region of interest are reconstructed using a de novo assembly. To begin the reconstruction process, short reads are stitched together bioinformatically by finding overlaps and extending them to produce a consensus sequence. In some embodiments the method further comprises mapping the consensus sequence to a reference sequence. Methods of the technology take advantage of sequencing quality scores that represent base calling confidence to reconstruct full length fragments. In addition to de novo assembly, fragments can be used to obtain phasing (assignment to homologous copies of chromosomes) of genomic variants by observing that consensus sequences originate from either one of the chromosomes.
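The idea of stitching short reads together by finding overlaps and extending them can be sketched with a simple greedy merge. This is an illustrative simplification, not the assembly method of the technology: it assumes error-free reads on the same strand, requires exact suffix-to-prefix matches, and ignores the quality scores mentioned above. The reads and function names are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of
    `right`, or 0 if no overlap of at least `min_overlap` exists."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences with the longest overlap
    until no pair overlaps by at least `min_overlap` bases."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
    return contigs


# Hypothetical overlapping reads, for illustration only.
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
contigs = greedy_assemble(reads)
print(contigs)
```

A practical assembler would additionally tolerate mismatches, weight bases by quality score, and consider reverse complements; those refinements are omitted to keep the sketch short.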

As used herein, a “system” denotes a set of components, real or abstract, comprising a whole where each component interacts with or is related to at least one other component within the whole.

Various nucleic acid sequencing platforms, nucleic acid assembly and/or mapping systems (e.g., computer software and/or hardware) are described, e.g., in U.S. Pat. Appl. Pub. No. 2011/0270533, which is incorporated herein by reference.

As used herein, the terms “alkyl” and the prefix “alk-” are inclusive of both straight chain and branched chain saturated or unsaturated groups, and of cyclic groups, e.g., cycloalkyl and cycloalkenyl groups. Unless otherwise specified, acyclic alkyl groups are from 1 to 6 carbons. Cyclic groups can be monocyclic or polycyclic and preferably have from 3 to 8 ring carbon atoms. Exemplary cyclic groups include cyclopropyl, cyclopentyl, cyclohexyl, and adamantyl groups. Alkyl groups may be substituted with one or more substituents or unsubstituted. Exemplary substituents include alkoxy, aryloxy, sulfhydryl, alkylthio, arylthio, halogen, alkylsilyl, hydroxyl, fluoroalkyl, perfluoroalkyl, amino, aminoalkyl, disubstituted amino, quaternary amino, hydroxyalkyl, carboxyalkyl, and carboxyl groups. When the prefix “alk” is used, the number of carbons contained in the alkyl chain is given by the range that directly precedes this term, with the number of carbons contained in the remainder of the group that includes this prefix defined elsewhere herein. For example, the term “C₁-C₄ alkaryl” exemplifies an aryl group of from 6 to 18 carbons (e.g., see below) attached to an alkyl group of from 1 to 4 carbons.

As used herein, the term “alkoxy” refers to a chemical substituent of the formula —OR, where R is an alkyl group. By “aryloxy” is meant a chemical substituent of the formula —OR′, where R′ is an aryl group.

As used herein, the term “alkyne” refers to a hydrocarbon comprising a carbon-carbon triple bond. One example of an alkyne-containing functional group is the propargyl group. Propargyl is an alkyl functional group of 2-propynyl with the structure HC≡C—CH₂—, derived from the alkyne propyne.

As used herein, the term “azide” or “azido” refers to any compound having the N₃— moiety therein. The azide may be an organic azide or a metal azide. One reaction involving azides is a type of click chemistry known as a copper(I)-catalyzed 1,3-dipolar cycloaddition reaction. This reaction conjugates alkynes and azides to form a five-membered triazole ring that provides a covalent linkage.

As used herein, the term “backbone” refers to a structural component of a nucleic acid molecule that is a series of covalently bonded atoms that together create the continuous chain of the molecule. In “natural” nucleic acids the backbone comprises phosphodiester bonds linking alternating sugars (e.g., ribose or deoxyribose) and phosphate moieties (related to phosphoric acid).

As used herein, a “target site” is a site of a subject at which it is desired for a bioactive agent to be delivered and to be active. A target site may be a cell, a cell type, a tissue, an organ, an area, or other designation of a subject's anatomy and/or physiology.

The terms “protein” and “polypeptide” refer to compounds comprising amino acids joined via peptide bonds and are used interchangeably. Conventional one- and three-letter amino acid codes are used herein as follows: Alanine: Ala, A; Arginine: Arg, R; Asparagine: Asn, N; Aspartate: Asp, D; Cysteine: Cys, C; Glutamate: Glu, E; Glutamine: Gln, Q; Glycine: Gly, G; Histidine: His, H; Isoleucine: Ile, I; Leucine: Leu, L; Lysine: Lys, K; Methionine: Met, M; Phenylalanine: Phe, F; Proline: Pro, P; Serine: Ser, S; Threonine: Thr, T; Tryptophan: Trp, W; Tyrosine: Tyr, Y; Valine: Val, V. As used herein, the codes Xaa and X refer to any amino acid.

In some embodiments compounds of the technology comprise an antibody component or moiety, e.g., an antibody or fragments or derivatives thereof. As used herein, an “antibody”, also known as an “immunoglobulin” (e.g., IgG, IgM, IgA, IgD, IgE), comprises two heavy chains linked to each other by disulfide bonds and two light chains, each of which is linked to a heavy chain by a disulfide bond. The specificity of an antibody resides in the structural complementarity between the antigen combining site of the antibody (or paratope) and the antigen determinant (or epitope). Antigen combining sites are made up of residues that are primarily from the hypervariable or complementarity determining regions (CDRs). Occasionally, residues from nonhypervariable or framework regions influence the overall domain structure and hence the combining site. In some embodiments the targeting moiety is a fragment of antibody, e.g., any protein or polypeptide-containing molecule that comprises at least a portion of an immunoglobulin molecule such as to permit specific interaction between said molecule and an antigen. The portion of an immunoglobulin molecule may include, but is not limited to, at least one complementarity determining region (CDR) of a heavy or light chain or a ligand binding portion thereof, a heavy chain or light chain variable region, a heavy chain or light chain constant region, a framework region, or any portion thereof. Such fragments may be produced by enzymatic cleavage, synthetic or recombinant techniques, as known in the art and/or as described herein. Antibodies can also be produced in a variety of truncated forms using antibody genes in which one or more stop codons have been introduced upstream of the natural stop site. The various portions of antibodies can be joined together chemically by conventional techniques, or can be prepared as a contiguous protein using genetic engineering techniques.

Fragments of antibodies include, but are not limited to, Fab (e.g., by papain digestion), F(ab′)2 (e.g., by pepsin digestion), Fab′ (e.g., by pepsin digestion and partial reduction) and Fv or scFv (e.g., by molecular biology techniques) fragments.

A Fab fragment can be obtained by treating an antibody with the protease papain. Also, the Fab may be produced by inserting DNA encoding a Fab of the antibody into a vector for a prokaryotic expression system or for a eukaryotic expression system, and introducing the vector into a prokaryote or eukaryote to express the Fab. A F(ab′)2 may be obtained by treating an antibody with the protease pepsin. Also, the F(ab′)2 can be produced by binding a Fab′ via a thioether bond or a disulfide bond. A Fab′ may be obtained by treating F(ab′)2 with a reducing agent, e.g., dithiothreitol. Also, a Fab′ can be produced by inserting DNA encoding a Fab′ fragment of the antibody into an expression vector for a prokaryote or an expression vector for a eukaryote, and introducing the vector into a prokaryote or eukaryote for its expression. A Fv fragment may be produced by restricted cleavage by pepsin, e.g., at 4° C. and pH 4.0 (a method called “cold pepsin digestion”). The Fv fragment consists of the heavy chain variable domain (VH) and the light chain variable domain (VL) held together by strong noncovalent interaction. A scFv fragment may be produced by obtaining cDNA encoding the VH and VL domains as previously described, constructing DNA encoding scFv, inserting the DNA into an expression vector for a prokaryote or an expression vector for a eukaryote, and then introducing the expression vector into a prokaryote or eukaryote to express the scFv.

In general, antibodies can usually be raised to any antigen, using the many conventional techniques now well known in the art. Any targeting antibody to an antigen which is found in sufficient concentration at a site in the body of a mammal which is of diagnostic or therapeutic interest can be used to make the compounds provided herein.

As used herein, the term “conjugated” refers to when one molecule or agent is physically or chemically coupled or adhered to another molecule or agent. Examples of conjugation include covalent linkage and electrostatic complexation. The terms “complexed,” “complexed with,” and “conjugated” are used interchangeably herein.

As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent described herein, or identified by a method described herein, to a patient, or application or administration of the therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease, or the predisposition toward disease.

As a result of the selection of substituents and substituent patterns, certain of the compounds of the present technology can have asymmetric centers and can occur as mixtures of stereoisomers, or as individual diastereomers, or enantiomers. All isomeric forms of these compounds, whether isolated or in mixtures, are within the scope of the present technology. Pharmaceutically acceptable salts include both the metallic (inorganic) salts and organic salts, a list of which is given in Remington's Pharmaceutical Sciences, 17th Edition, pg. 1418 (1985). It is well known to one skilled in the art that an appropriate salt form is chosen based on physical and chemical properties. As will be understood by those skilled in the art, pharmaceutically acceptable salts include, but are not limited to, salts of inorganic acids such as hydrochloride, sulfate, phosphate, diphosphate, hydrobromide, and nitrate; or salts of an organic acid such as malate, maleate, fumarate, tartrate, succinate, citrate, acetate, lactate, methanesulfonate, p-toluenesulfonate, pamoate, salicylate, and stearate. Similarly, pharmaceutically acceptable cations include, but are not limited to, sodium, potassium, calcium, aluminum, lithium, and ammonium (especially ammonium salts with secondary amines). Also included within the scope of this technology are crystal forms, hydrates, and solvates.

Compositions according to the technology can be administered in the form of pharmaceutically acceptable salts. The term “pharmaceutically acceptable salt” refers to a salt that possesses the effectiveness of the parent compound and is not biologically or otherwise undesirable (e.g., is neither toxic nor otherwise deleterious to the recipient thereof). Suitable salts include acid addition salts that may, for example, be formed by mixing a solution of the compound of the present technology with a solution of a pharmaceutically acceptable acid such as hydrochloric acid, sulfuric acid, acetic acid, trifluoroacetic acid, or benzoic acid. Certain of the compounds employed in the present technology may carry an acidic moiety (e.g., COOH or a phenolic group), in which case suitable pharmaceutically acceptable salts thereof can include alkali metal salts (e.g., sodium or potassium salts), alkaline earth metal salts (e.g., calcium or magnesium salts), and salts formed with suitable organic ligands such as quaternary ammonium salts. Also, in the case of an acid (COOH) or alcohol group being present, pharmaceutically acceptable esters can be employed to modify the solubility or hydrolysis characteristics of the compound.

The term “administration” and variants thereof (e.g., “administering” a compound) in reference to a compound mean providing the compound or a prodrug of the compound to the individual in need of treatment or prophylaxis. When a compound of the technology or a prodrug thereof is provided in combination with one or more other active agents, “administration” and its variants are each understood to include provision of the compound or prodrug and other agents at the same time or at different times. When the agents of a combination are administered at the same time, they can be administered together in a single composition or they can be administered separately. As used herein, the term “composition” is intended to encompass a product comprising the specified ingredients in the specified amounts, as well as any product that results, directly or indirectly, from combining the specified ingredients in the specified amounts.

By “pharmaceutically acceptable” is meant that the ingredients of the pharmaceutical composition must be compatible with each other and not deleterious to the recipient thereof.

The term “subject” as used herein refers to an animal, preferably a mammal, most preferably a human, who has been the object of treatment, observation, or experiment.

The term “effective amount” as used herein means that amount of active compound or pharmaceutical agent that elicits the biological or medicinal response in a cell, tissue, organ, system, animal, or human that is being sought by a researcher, veterinarian, medical doctor, or other clinician. In some embodiments, the effective amount is a “therapeutically effective amount” for the alleviation of the symptoms of the disease or condition being treated. In some embodiments, the effective amount is a “prophylactically effective amount” for prophylaxis of the symptoms of the disease or condition being prevented. The term also includes herein the amount of active compound sufficient to inhibit the mineralocorticoid receptor and thereby elicit a response being sought (e.g., an “inhibition effective amount”). When the active compound is administered as the salt, references to the amount of active ingredient are to the free form (the non-salt form) of the compound. In some embodiments, this amount is between 1 mg and 1000 mg per day, e.g., between 1 mg and 500 mg per day or between 1 mg and 200 mg per day.

In the method of the present technology, compounds, optionally in the form of a salt, can be administered by any means that produces contact of the active agent with the agent's site of action. They can be administered by any conventional means available for use in conjunction with pharmaceuticals, either as individual therapeutic agents or in a combination of therapeutic agents. They can be administered alone, but typically are administered with a pharmaceutical carrier selected on the basis of the chosen route of administration and standard pharmaceutical practice. The compounds of the technology can, for example, be administered orally, parenterally (including subcutaneous injections, intravenous, intramuscular, intrasternal injection, or infusion techniques), by inhalation spray, or rectally, in the form of a unit dosage of a pharmaceutical composition containing an effective amount of the compound and conventional non-toxic pharmaceutically-acceptable carriers, adjuvants, and vehicles. Liquid preparations suitable for oral administration (e.g., suspensions, syrups, elixirs, and the like) can be prepared according to techniques known in the art and can employ any of the usual media such as water, glycols, oils, alcohols, and the like. Solid preparations suitable for oral administration (e.g., powders, pills, capsules, and tablets) can be prepared according to techniques known in the art and can employ such solid excipients as starches, sugars, kaolin, lubricants, binders, disintegrating agents, and the like. Parenteral compositions can be prepared according to techniques known in the art and typically employ sterile water as a carrier and optionally other ingredients, such as a solubility aid. Injectable solutions can be prepared according to methods known in the art wherein the carrier comprises a saline solution, a glucose solution, or a solution containing a mixture of saline and glucose. Further description of methods suitable for use in preparing pharmaceutical compositions for use in the present technology and of ingredients suitable for use in said compositions is provided in Remington's Pharmaceutical Sciences, 18th edition, edited by A. R. Gennaro, Mack Publishing Co., 1990.

Compounds of the present technology can be made by a variety of methods depicted in the synthetic reaction schemes provided herein. The starting materials and reagents used in preparing these compounds generally are either available from commercial suppliers, such as Aldrich Chemical Co., or are prepared by methods known to those skilled in the art following procedures set forth in references such as Fieser and Fieser's Reagents for Organic Synthesis, Wiley & Sons: New York, Volumes 1-21; R. C. LaRock, Comprehensive Organic Transformations, 2nd edition Wiley-VCH, New York 1999; Comprehensive Organic Synthesis, B. Trost and I. Fleming (Eds.) vol. 1-9 Pergamon, Oxford, 1991; Comprehensive Heterocyclic Chemistry, A. R. Katritzky and C. W. Rees (Eds) Pergamon, Oxford 1984, vol. 1-9; Comprehensive Heterocyclic Chemistry II, A. R. Katritzky and C. W. Rees (Eds) Pergamon, Oxford 1996, vol. 1-11; and Organic Reactions, Wiley & Sons: New York, 1991, Volumes 1-40. The synthetic reaction schemes and examples provided herein are merely illustrative of some methods by which the compounds of the present technology can be synthesized, and various modifications to these synthetic reaction schemes can be made and will be suggested to one skilled in the art having referred to the disclosure contained in this application.

The starting materials and the intermediates of the synthetic reaction schemes can be isolated and purified if desired using conventional techniques, including but not limited to, filtration, distillation, crystallization, chromatography, and the like. Such materials can be characterized using conventional means, including physical constants and spectral data.

Description

The technology described herein relates to nucleotide analogs and related methods, compositions (e.g., reaction mixtures), kits, and systems for manipulating, detecting, isolating, and sequencing nucleic acids. In particular, some embodiments of the nucleotide analogs comprise an alkyne moiety that provides both terminating and linking functionalities. The technology provides advantages over conventional methods such as a lower cost and reduced complexity.

1. Nucleotide Analogs

Provided herein are analogs of nucleotides. In some embodiments, the nucleotide analogs comprise one or more alkyne terminator moieties. For example, in some embodiments the technology provides a 3′-O-blocked nucleotide analog that is a 3′-O-alkynyl nucleotide analog. In some embodiments, the 3′-O-blocked nucleotide analog is a 3′-O-propargyl nucleotide analog having a structure as shown below:

wherein B is the base of the nucleotide, e.g., adenine, guanine, cytosine, thymine, or uracil, e.g., B is one of:

or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc. and wherein P comprises a phosphate moiety (e.g., a monophosphate, a diphosphate, a triphosphate, a tetraphosphate); a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate, as defined herein.

The nucleotide analogs are not limited to a specific phosphate group. In some embodiments, the phosphate group is a monophosphate group or a polyphosphate such as a diphosphate group, a triphosphate group, or a tetraphosphate group. In some embodiments, the phosphate group is a pyrophosphate. In some embodiments, P represents a group comprising a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate.

Moreover, the base of the nucleotide analogs is not limited to a specific base. In some embodiments, the base is an adenine, cytosine, guanine, thymine, or uracil, or an analog thereof such as, for example, an acyclic base. The nucleotide analogs are not limited to a specific sugar moiety. In some embodiments, said sugar moiety is a ribose, deoxyribose, or dideoxyribose, or an analog, derivative, and/or modification thereof (e.g., a thiofuranose, thioribose, thiodeoxyribose, etc.). In some embodiments, the sugar moiety is an arabinose or other related carbohydrate.

In some embodiments, the nucleotide analog is a 3′-O-propargyl-dNTP where N is selected from the group consisting of A, C, G, T and U. In some embodiments, the nucleotide analogs comprise detectable labels or tags such as an optically detectable moiety (e.g., a fluorescent dye), an electrochemically detectable moiety (e.g., a redox active group), a quantum dot, a chromogen, a biological image contrast agent, a drug delivery vehicle tag, etc.

The synthesis of compounds provided herein is performed as described in, e.g., Bentley et al. (2008) “Accurate whole genome sequencing using reversible terminator chemistry” Nature 456(7218): 53-9 and Ju et al. (2006) “Four Color DNA Sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators,” PNAS 103(52): 19635-40, incorporated herein by reference, with modifications as needed to provide the various nucleotide analogs described herein. Additionally, various molecular characterizations such as NMR, mass spectrometry, and chromatography/affinity analysis are used in some embodiments to confirm successful synthesis of the correct compounds.

In some embodiments, synthetic methods for compounds encompassed and contemplated by the technology described herein comprise one or more of the following synthetic schemes or modifications thereof:

Synthesis of 3′-O-propargyl dCTP

Synthesis of 3′-O-propargyl dTTP

Synthesis of 3′-O-propargyl dATP

Synthesis of 3′-O-propargyl dGTP

In some embodiments, the nucleotide analogs are used to incorporate alkyne moieties into nucleic acid polymers, e.g., by a polymerase. In some embodiments, a polymerase is modified to enhance incorporation of the nucleotide analogs disclosed herein. Exemplary modified polymerases are disclosed in U.S. Pat. Nos. 4,889,818; 5,374,553; 5,420,029; 5,455,170; 5,466,591; 5,618,711; 5,624,833; 5,674,738; 5,789,224; 5,795,762; 5,939,292; and U.S. Patent Publication Nos. 2002/0012970 and 2004/0005599. Non-limiting examples of modified polymerases include G46E E678G CS5 DNA polymerase, E615G Taq DNA polymerase, ΔZO5R polymerase, and G46E L329A E678G CS5 DNA polymerase disclosed in U.S. Patent Publication No. 2005/0037398. In some embodiments, the polymerase is a Thermococcus sp. 9° N-7 polymerase sold under the trade name THERMINATOR (e.g., THERMINATOR II) by New England BioLabs (Ipswich, Mass.). The production of modified polymerases can be accomplished using many conventional techniques in molecular biology and recombinant DNA described herein and known in the art. In some embodiments, polymerase mutants, such as those described in U.S. Pat. No. 5,939,292, which incorporate NTPs as well as dNTPs, are used.

In some embodiments the nucleotide analogs contain tags in addition to alkyne moieties (see supra). In some embodiments, nucleotide analogs with 3′ alkyne moieties are used to terminate a polymerase reaction. Chemical tags containing an azido moiety can then be appended to the nucleic acid polymer through click chemistry. In some embodiments, the reaction of the terminator alkyne compound with the azido moiety-containing compound forms a triazole compound. In some embodiments, the triazole compound functions as a nucleic acid backbone and further enzymatic reactions such as PCR are performed on the triazole compound.

2. Oligonucleotides

In some embodiments, the nucleotide analogs find use for the synthesis of triazole-backbone-modified nucleic acids (e.g., oligonucleotide analogs). For example, the nucleotide analogs find use in methods for aqueous, solid-phase oligonucleotide synthesis. Such methods thus obviate the need for, inter alia, use of organic solvents, deprotection steps, and capping steps in some conventional syntheses; in addition, aqueous methods minimize or eliminate the undesired oxidation of phosphorus in the synthesized compounds, e.g., during cycle synthesis. It is contemplated that an advantage of aqueous-phase synthesis is that it is more rapid than conventional organic-phase synthesis techniques.

In some embodiments is provided a triazole-backbone-modified oligonucleotide comprising nucleotide analogs provided herein. That is, the nucleotide analogs described herein find use in the synthesis of modified oligonucleotides comprising one or more nucleotide analogs and comprising triazole groups in the molecular backbone. In some embodiments, oligonucleotides comprise some conventional nucleotides and some nucleotide analogs in various proportions. In some embodiments, oligonucleotides comprise only nucleotide analogs and do not comprise conventional nucleotides.

Accordingly, in some embodiments is provided a nucleotide analog as described elsewhere herein, e.g., having a structure according to:

where B is the base of the nucleotide (e.g., adenine, guanine, thymine, cytosine, or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc.).

Such nucleotide analogs and variants and modified derivatives thereof (e.g., comprising a base analog or alternative sugar as described herein) provide a directional, bi-functional nucleotide analog (e.g., a directional, bi-functional polymerization agent), e.g., for the synthesis of an oligonucleotide (e.g., an oligonucleotide analog, e.g., an oligonucleotide comprising a nucleotide analog described herein). In some embodiments, the directional, bi-functional nucleotide analog provides for synthesis of an oligonucleotide in a 5′ to 3′ direction and in some embodiments the directional, bi-functional nucleotide analog provides for synthesis of an oligonucleotide in a 3′ to 5′ direction. In some embodiments, the synthesis of the oligonucleotide comprises use of a propargyl moiety and a linker attached to a solid support (e.g., a linker (e.g., a carboxylate linker) that is cleavable under acidic (e.g., mildly acidic) conditions). In some embodiments, the synthesis of the oligonucleotide comprises use of a propargyl moiety and an azide linker attached to a solid support. In some embodiments, a 3′-thio-modified propargyl moiety is linked to the solid support and is cleaved with a reagent comprising silver nitrate or mercuric chloride. In some embodiments, the solid support comprises a controlled pore glass, silica, sephadex, agarose, acrylamide, latex, or polystyrene, etc., provided, in some embodiments, as microspheres.

Representative synthetic schemes for producing oligonucleotides are provided as follows:

a. 3′ to 5′ oligonucleotide synthesis using 3′-alkynyl/5′-azido nucleotide analog

In exemplary synthetic scheme a, X is a solid support, the wavy line (˜) is a cleavable linker, B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. The first step (1) links the first nucleotide analog to the solid support (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst). Then, one or more (e.g., multiple) rounds of the second step (2) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

b. 5′ to 3′ oligonucleotide synthesis using 3′-alkynyl/5′-azido nucleotide analog

In exemplary synthetic scheme b, X is a solid support, the wavy line (˜) is a cleavable linker (e.g., a carboxylate linker), B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. After reacting the first nucleotide analog with the solid support comprising a linker and reactive carboxylate moiety (e.g., to form an ester link), one or more (e.g., multiple) rounds of nucleotide addition and reaction (1) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

c. 5′ to 3′ oligonucleotide synthesis using 3′-azido/5′-alkynyl nucleotide analog

In exemplary synthetic scheme c, X is a solid support, the wavy line (˜) is a cleavable linker, B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. The first step (1) links the first nucleotide analog to the solid support (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst). Then, one or more (e.g., multiple) rounds of the second step (2) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

d. 3′ to 5′ oligonucleotide synthesis using 3′-azido/5′-alkynyl nucleotide analog

In exemplary synthetic scheme d, X is a solid support, the wavy line (˜) is a cleavable linker, B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. The first step (1) links the first nucleotide analog to the solid support (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst). Then, one or more (e.g., multiple) rounds of the second step (2) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

In some embodiments, the oligonucleotide and/or nucleotide analog is reacted with a linker to attach the oligonucleotide and/or nucleotide analog to a solid support, e.g., a bead, a planar surface (an array), a column, etc. The term “solid support” as used herein refers to a material or group of materials having a rigid or semi-rigid surface or surfaces. In many embodiments, at least one surface of the solid support is substantially flat, although in some embodiments it may be desirable to separate regions of the solid support with, for example, wells, raised regions, pins, etched trenches, or the like. According to other embodiments, the solid support takes the form of beads, resins, gels, microspheres, or other geometric configurations. See, e.g., U.S. Pat. No. 5,744,305 and U.S. Pat. Pub. Nos. 20090149340 and 20080038559 for exemplary substrates. In some embodiments, the linker is a cleavable linker (e.g., cleavable by light, heat, chemical, or biochemical reaction).

In exemplary synthesis schemes a, b, c, and d, embodiments of methods for synthesizing an oligonucleotide comprise one or more additional steps of adding a nucleotide analog, reacting a nucleotide analog, washing away and/or otherwise removing an unincorporated nucleotide analog (e.g., after a synthesis step), cleaving a linker, isolating a synthesized oligonucleotide, purifying a synthesized oligonucleotide, and/or adding a tag or a label to the synthesized oligonucleotide.
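
By way of non-limiting illustration, the directionality of schemes a through d determines the order in which nucleotide analogs are coupled to the support-bound chain. The following Python sketch derives that order from a desired sequence written 5′ to 3′. The function names addition_order and coupling_rounds and the direction labels "5to3" and "3to5" are hypothetical and are introduced here only for illustration; the sketch models the ordering of couplings, not the chemistry.

```python
from typing import List


def addition_order(sequence_5_to_3: str, direction: str) -> List[str]:
    """Return the bases in the order they would be coupled to the support.

    direction is "5to3" (chain grows toward its 3' end, so the 5'-terminal
    base is attached first) or "3to5" (chain grows toward its 5' end, so the
    3'-terminal base is attached first). Both labels are hypothetical.
    """
    bases = list(sequence_5_to_3.upper())
    if direction == "5to3":
        return bases
    if direction == "3to5":
        return bases[::-1]
    raise ValueError("direction must be '5to3' or '3to5'")


def coupling_rounds(sequence_5_to_3: str) -> int:
    """Number of repeated coupling rounds after the first analog is attached."""
    return max(len(sequence_5_to_3) - 1, 0)


if __name__ == "__main__":
    print(addition_order("ACGT", "3to5"))
    print(coupling_rounds("ACGT"))
```

For a four-base product, the first analog is attached to the support once and the coupling round is then repeated three times, regardless of direction.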

3. Tagging and Labeling

Nucleic acid detection methodologies serve a critical role in the field of molecular diagnostics. The ability to manipulate biomolecules specifically and efficiently has been the core driving force behind many successful nucleic acid detection technologies. Among the many molecular biology techniques, the ability to label or “tag” a biomolecule of interest has been a key technology for subsequent detection and identification of the biomolecule.

Accordingly, in some embodiments the technology provides compositions, methods, systems, and kits related to tagging of biomolecules such as nucleic acids and/or nucleotides. In some embodiments, alkyne-containing nucleotides such as 3′-O-propargyl-modified nucleotides (e.g., 3′-O-propargyl dNTPs) are incorporated into a nucleic acid in a polymerase extension reaction. In some embodiments, the nucleotide analog halts the polymerase reaction. In some embodiments, the alkyne-containing nucleotide is used (e.g., without further processing and/or purification) in a tagging reaction with an azide-modified tag or labeling reagent using chemical ligation (e.g., a click chemistry reaction). The covalent linkage created using this chemistry mimics natural nucleic acid phosphodiester bonds, thus providing a conjugated product that is suitable for use in subsequent enzymatic reactions such as a polymerase chain reaction.

Labels and tags are compounds, structures, or elements that are amenable to at least one method of detection and/or isolation that allows for discrimination between different labels and/or tags. For example, labels and/or tags comprise semiconductor nanocrystals, metal compounds, peptides, antibodies, small molecules, isotopes, particles, or structures having different shapes, colors, barcodes, or diffraction patterns associated therewith or embedded therein, strings of numbers, random fragments of proteins or nucleic acids, or different isotopes.

The terms “label” and “tag” are used interchangeably herein to refer to any chemical moiety attached to a nucleotide or nucleic acid, wherein the attachment may be covalent or non-covalent. Preferably, the label is detectable and renders the nucleotide or nucleic acid detectable to the practitioner of the technology. Exemplary detectable labels that find use with the technology provided herein include, for example, a fluorescent label, a chemiluminescent label, a quencher, a radioactive label, biotin, and gold, or combinations thereof. Detectable labels include luminescent molecules, fluorochromes, fluorescent quenching agents, colored molecules (e.g., chromogens used for in situ hybridization (ISH, FISH) and bright field imaging applications), radioisotopes, or scintillants. Detectable labels also include any useful linker molecule (such as biotin, avidin, digoxigenin, streptavidin, HRP, protein A, protein G, antibodies or fragments thereof, Grb2, polyhistidine, Ni²⁺, FLAG tags, myc tags), heavy metals, enzymes (examples include alkaline phosphatase, peroxidase, and luciferase), electron donors/acceptors, acridinium esters, dyes, and colorimetric substrates. It is also envisioned that a change in mass may be considered a detectable label, e.g., as finds use in surface plasmon resonance detection.

The technology also finds use in applications such as linking DNA-containing alkynes to an image contrast agent (e.g., meglumines, ferumoxsil, ferumoxides, gadodiamide, gadoversetamide, gallium compounds, indium compounds, thallium compounds, rubidium compounds, technetium compounds, iopamidol, etc.), e.g., for biomedical imaging (e.g., magnetic resonance imaging (MRI), computed tomography (CT) scanning, X-ray, etc.), coupling DNA to oligo and/or antisense drug-delivery vehicle tags (e.g., steroids, lipids, cholesterol, vitamins, hormones, carbohydrates, and/or receptor-specific ligands (e.g., folate, nicotinamide, acetylcholine, GABA, glutamate, serotonin, etc.)), and coupling to chromogen moieties for in situ hybridization applications. The skilled artisan would readily recognize useful detectable labels that are not mentioned above, which may be employed in the operation of the present invention.

As such, the technology is not limited in the label or tag that is linked to the nucleic acid, e.g., by use of an azide labeling reagent in a click chemistry reaction. Thus, in some embodiments, the label comprises a fluorescently detectable moiety that is based on a dye, wherein the dye is a xanthene, fluorescein, rhodamine, BODIPY, cyanine, coumarin, pyrene, phthalocyanine, phycobiliprotein, ALEXA FLUOR® 350, ALEXA FLUOR® 405, ALEXA FLUOR® 430, ALEXA FLUOR® 488, ALEXA FLUOR® 514, ALEXA FLUOR® 532, ALEXA FLUOR® 546, ALEXA FLUOR® 555, ALEXA FLUOR® 568, ALEXA FLUOR® 594, ALEXA FLUOR® 610, ALEXA FLUOR® 633, ALEXA FLUOR® 647, ALEXA FLUOR® 660, ALEXA FLUOR® 680, ALEXA FLUOR® 700, ALEXA FLUOR® 750, a fluorescent semiconductor crystal, or a squaraine dye. In some embodiments, the tag or label comprises a radioisotope, a spin label, a quantum dot, or a bioluminescent moiety. In some embodiments, the label is a fluorescently detectable moiety as described in, e.g., Haugland (September 2005) MOLECULAR PROBES HANDBOOK OF FLUORESCENT PROBES AND RESEARCH CHEMICALS (10th ed.), which is herein incorporated by reference in its entirety.

In some embodiments the label (e.g., a fluorescently detectable label) is one available from ATTO-TEC GmbH (Am Eichenhang 50, 57076 Siegen, Germany), e.g., as described in U.S. Pat. Appl. Pub. Nos. 20110223677, 20110190486, 20110172420, 20060179585, and 20030003486; and in U.S. Pat. No. 7,935,822, all of which are incorporated herein by reference.

In some embodiments, the nucleic acid and/or nucleotide comprising a modified nucleotide, e.g., comprising an alkyne group, is tagged with a moiety that provides for detection and/or isolation of the nucleic acid and/or nucleotide by specific interaction with a second moiety. For example, in some embodiments, the nucleic acid and/or nucleotide is linked (e.g., by a click chemistry reaction) to a tag comprising an azide and a biotin moiety, an epitope, an antigen, an aptamer, an affinity tag, a histidine tag, a barcode oligonucleotide, a poly-A tail, a capture oligonucleotide, a protein, a sugar, a chelator, a mass tag (e.g., a 2-nitro-methyl-benzyl group, a 2-nitro-methyl-3-fluorobenzyl group, a 2-nitro-α-methyl-3,4-difluorobenzyl group, a 2-nitro-α-methyl-3,4-dimethoxybenzyl group, a 2-nitro-α-methyl-benzyl group, a 2-nitro-α-methyl-3-fluorobenzyl group, a 2-nitro-methyl-3,4-difluorobenzyl group, a 2-nitro-α-methyl-3,4-dimethoxybenzyl group), or a charge tag.

In some embodiments, the nucleic acid and/or nucleotide comprising an alkyne is reacted with a linker comprising an azide to attach the nucleic acid and/or nucleotide to a solid support, e.g., a bead, a planar surface (an array), a column, etc. In some embodiments, the linker is a cleavable linker (e.g., cleavable by light, heat, chemical, or biochemical reaction).

4. Reactions

In some embodiments, the technology finds use in linking an oligonucleotide to a nucleic acid (e.g., a DNA, an RNA). For example, in some embodiments, a nucleic acid comprising a nucleotide analog (e.g., a nucleic acid comprising an alkyne group, e.g., a 3′-O-propargyl nucleotide, e.g., a 3′-O-propargyl dNTP) is linked to an oligonucleotide comprising a group (e.g., an azide group) that is chemically reactive with the chemical moiety of the nucleotide analog, e.g., by a click chemistry reaction. In some embodiments, the oligonucleotide is single-stranded and in some embodiments the oligonucleotide is double-stranded. In some embodiments the nucleic acid is a DNA and in some embodiments the nucleic acid is an RNA; in some embodiments the oligonucleotide is a DNA and in some embodiments the oligonucleotide is an RNA.

In some embodiments, methods of the technology involve attaching an adaptor to a nucleic acid. In some embodiments an adaptor comprises a functional moiety for chemical ligation to a nucleotide analog. For example, in some embodiments an adaptor comprises an azide group (e.g., at the 5′ end) that is reactive with an alkynyl group (e.g., a propargyl group, e.g., at the 3′ end of a nucleic acid comprising the nucleotide analog), e.g., by a click chemistry reaction (e.g., using a copper (e.g., a copper-based) catalyst reagent).

In some embodiments the alkyne is a butargyl group or a structural derivative thereof. In some embodiments the alkyne comprises a sulfur atom, e.g., to provide a thio-alkynyl, a thio-propargyl (e.g., 3′-S-propargyl) group, or a structural derivative thereof.

In some embodiments, the adaptors comprise a universal sequence and/or an index, e.g., a barcode nucleotide sequence. Additionally, adaptors can contain one or more of a variety of sequence elements, including but not limited to, one or more amplification primer annealing sequences or complements thereof, one or more sequencing primer annealing sequences or complements thereof, one or more barcode sequences, one or more common sequences shared among multiple different adaptors or subsets of different adaptors (e.g., a universal sequence), one or more restriction enzyme recognition sites, one or more overhangs complementary to one or more target polynucleotide overhangs, one or more probe binding sites (e.g., for attachment to a sequencing platform, such as a flow cell for massively parallel sequencing, such as developed by Illumina, Inc.), one or more random or near-random sequences (e.g., one or more nucleotides selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors comprising the random sequence), and combinations thereof. Two or more sequence elements can be non-adjacent to one another (e.g., separated by one or more nucleotides), adjacent to one another, partially overlapping, or completely overlapping. For example, an amplification primer annealing sequence can also serve as a sequencing primer annealing sequence. Sequence elements can be located at or near the 3′ end, at or near the 5′ end, or in the interior of the adaptor oligonucleotide. When an adaptor oligonucleotide is capable of forming secondary structure, such as a hairpin, sequence elements can be located partially or completely outside the secondary structure, partially or completely inside the secondary structure, or in between sequences participating in the secondary structure. For example, when an adaptor oligonucleotide comprises a hairpin structure, sequence elements can be located partially or completely inside or outside the hybridizable sequences (the “stem”), including in the sequence between the hybridizable sequences (the “loop”). In some embodiments, the adaptor oligonucleotides in a plurality of adaptor oligonucleotides having different barcode sequences comprise a sequence element common among all adaptor oligonucleotides in the plurality. A difference in sequence elements can be any such that at least a portion of different adaptors do not completely align, for example, due to changes in sequence length, deletion or insertion of one or more nucleotides, or a change in the nucleotide composition at one or more nucleotide positions (such as a base change or base modification). In some embodiments, an adaptor oligonucleotide comprises a 5′ overhang, a 3′ overhang, or both that is complementary to one or more target polynucleotides. Complementary overhangs can be one or more nucleotides in length, including but not limited to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length. Complementary overhangs may comprise a fixed sequence. Complementary overhangs may comprise a random sequence of one or more nucleotides, such that one or more nucleotides are selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors with complementary overhangs comprising the random sequence. In some embodiments, an adaptor overhang is complementary to a target polynucleotide overhang produced by restriction endonuclease digestion. In some embodiments, an adaptor overhang consists of an adenine or a thymine.
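
By way of non-limiting illustration, a pool of random overhangs in which every permitted nucleotide is represented at every random position can be enumerated computationally. The following Python sketch lists all overhang sequences of a given length over a chosen nucleotide set. The function name overhang_pool and its parameters are hypothetical and are introduced here only for illustration; exhaustive enumeration is one assumed way of guaranteeing representation and is not stated to be the method by which any adaptor pool is prepared.

```python
from itertools import product
from typing import List


def overhang_pool(length: int, nucleotides: str = "ACGT") -> List[str]:
    """Enumerate every overhang of the given length over the nucleotide set.

    Each nucleotide in `nucleotides` appears at each position in at least one
    member of the returned pool.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return ["".join(combo) for combo in product(nucleotides, repeat=length)]


if __name__ == "__main__":
    pool = overhang_pool(2)
    print(len(pool), pool[:4])
```

The pool size grows as the number of permitted nucleotides raised to the overhang length, so a two-position overhang over four nucleotides has sixteen members.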

In some embodiments, the adaptor sequences contain a molecular binding site identification element to facilitate identification and isolation of the target nucleic acid for downstream applications. Molecular binding as an affinity mechanism allows for the interaction between two molecules to result in a stable association complex. Molecules that can participate in molecular binding reactions include proteins, nucleic acids, carbohydrates, lipids, and small organic molecules such as ligands, peptides, or drugs.

When a nucleic acid molecular binding site is used as part of the adaptor, it can be used to employ selective hybridization to isolate a target sequence. Selective hybridization may restrict substantial hybridization to target nucleic acids containing the adaptor with the molecular binding site and capture nucleic acids, which are sufficiently complementary to the molecular binding site. Thus, through “selective hybridization” one can detect the presence of the target polynucleotide in an impure sample containing a pool of many nucleic acids. An example of a nucleotide-nucleotide selective hybridization isolation system comprises a system with several capture nucleotides, which are complementary sequences to the molecular binding identification elements, and are optionally immobilized to a solid support. In other embodiments, the capture polynucleotides are complementary to the target sequence itself or a barcode or unique tag contained within the adaptor. The capture polynucleotides can be immobilized to various solid supports, such as inside of a well of a plate, mono-dispersed spheres, microarrays, or any other suitable support surface known in the art. The hybridized complementary adaptor polynucleotides attached on the solid support can be isolated by washing away the undesirable non-binding nucleic acids, leaving the desirable target polynucleotides behind. If complementary adaptor molecules are fixed to paramagnetic spheres or similar bead technology for isolation, the spheres can then be mixed in a tube together with the target polynucleotide containing the adaptors. When the adaptor sequences have been hybridized with the complementary sequences fixed to the spheres, undesirable molecules can be washed away while spheres are kept in the tube with a magnet or similar agent. The desired target molecules can be subsequently released by increasing the temperature, changing the pH, or by using any other suitable elution method known in the art.

As described elsewhere herein, a “barcode” or “barcode oligonucleotide” is a known nucleic acid sequence that allows some feature of a nucleic acid with which the barcode is associated to be identified. For example, in some embodiments, the feature of the nucleic acid to be identified is the sample or source from which the nucleic acid is derived. The barcode sequence generally includes certain features that make the sequence useful, e.g., in sequencing reactions. For example, the barcode sequences are designed to have minimal or no homopolymer regions, e.g., 2 or more of the same base in a row such as AA or CCC, within the barcode sequence. In some embodiments, the barcode sequences are also designed so that they are at least one edit distance away from the base addition order when performing a manipulation or molecular biological process, such as base-by-base sequencing, ensuring that the first and last bases do not match the expected bases of the sequence.
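
By way of non-limiting illustration, the homopolymer criterion described above can be checked computationally. The following Python sketch reports the longest run of identical bases in a candidate barcode and tests it against a permitted run length. The function names longest_homopolymer and is_homopolymer_free and the max_run parameter are hypothetical and are introduced here only for illustration.

```python
def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of identical bases in seq."""
    longest = 0
    run = 0
    previous = None
    for base in seq.upper():
        # Extend the current run if the base repeats, otherwise start a new run.
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def is_homopolymer_free(seq: str, max_run: int = 1) -> bool:
    """True if no base occurs more than max_run times in a row.

    The default max_run=1 rejects any run of 2 or more identical bases,
    such as AA or CCC.
    """
    return longest_homopolymer(seq) <= max_run


if __name__ == "__main__":
    for candidate in ("ACGTAC", "ACGGTA", "ACCCGT"):
        print(candidate, longest_homopolymer(candidate), is_homopolymer_free(candidate))
```

Raising max_run relaxes the filter for designs in which short runs are tolerated ("minimal" rather than "no" homopolymer regions).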

In some embodiments, the barcode sequences are designed such that each sequence is correlated to a particular nucleic acid. Methods of designing sets of barcode sequences are shown, for example, in U.S. Pat. No. 6,235,475, the contents of which are incorporated by reference herein in their entirety. In some embodiments, barcode sequences range from about 5 nucleotides to about 15 nucleotides. In a particular embodiment, the barcode sequences range from about 4 nucleotides to about 7 nucleotides. In some embodiments, lengths and sequences of barcode sequences are designed to achieve a desired level of accuracy of determining the identity of a nucleic acid. For example, in some embodiments barcode sequences are designed such that after a tolerable number of point mutations, the identity of the associated nucleic acid is deduced with a desired accuracy. In some embodiments, a Tn-5 transposase (commercially available from Epicentre Biotechnologies; Madison, Wis.) cuts a nucleic acid into fragments and inserts short pieces of DNA into the cuts. The short pieces of DNA are used to incorporate the barcode sequences.
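
By way of non-limiting illustration, tolerance to point mutations can be assessed from the pairwise Hamming distances within a set of equal-length barcodes: if the minimum pairwise distance is d, up to (d − 1)/2 substitutions (rounded down) can be corrected unambiguously. The following Python sketch computes that distance and assigns a read to a barcode. The function names hamming, min_pairwise_distance, and decode, and the example barcodes, are hypothetical and are introduced here only for illustration; Hamming distance is an assumed metric that covers substitutions only, not insertions or deletions, and the sketch is not stated to be the design method of any reference cited herein.

```python
from itertools import combinations
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes: Sequence[str]) -> int:
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def decode(read: str, barcodes: Sequence[str], max_mismatches: int) -> Optional[str]:
    """Return the single barcode within max_mismatches of read, else None."""
    hits = [b for b in barcodes if hamming(read, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    barcode_set = ["ACGTAC", "TGCATG", "CATGCA"]  # hypothetical example set
    d = min_pairwise_distance(barcode_set)
    tolerable = (d - 1) // 2
    print(d, tolerable, decode("ACGTAA", barcode_set, tolerable))
```

Reads that match no barcode, or more than one barcode, within the tolerance are left unassigned rather than guessed.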

Methods for designing sets of barcode sequences and other methods for attaching adaptors (e.g., comprising barcode sequences) are shown in U.S. Pat. Nos. 6,138,077; 6,352,828; 5,636,400; 6,172,214; 6,235,475; 7,393,665; 7,544,473; 5,846,719; 5,695,934; 5,604,097; 6,150,516; RE39,793; 7,537,897; 6,172,218; and 5,863,722, the content of each of which is incorporated by reference herein in its entirety.

With appropriate changes to reaction schemes, 5′ alkynyl/3′ azido and 5′ azido/3′ alkynyl nucleotide analogs are contemplated to be interchangeable in reactions with the appropriate reactive substrates for linking to the 5′ and/or 3′ ends of nucleotide analogs, e.g., by click chemistry.

In some embodiments, the technology finds use in a primer extension reaction (see, e.g., FIG. 1) and/or adaptor ligation (see, e.g., FIG. 1). In particular embodiments, a primer annealed to a template (e.g., a target nucleic acid) is extended by a polymerase, which adds a nucleotide analog to the primer. While FIG. 1 shows the exemplary addition of a G-containing nucleotide analog across from the C base in the template, the primer extension technology is not limited in the bases that are added. Then, in some embodiments, an azide-modified DNA (e.g., an adaptor, e.g., an adaptor comprising a primer binding site and/or a barcode) is ligated to the primer extension product (e.g., by click chemistry). The ligation product comprises a linkage that mimics the conventional nucleic acid backbone, e.g., a triazole, and that is biocompatible with downstream enzymatic and/or chemical reactions, e.g., PCR (e.g., see FIG. 1).

5. Sequencing

In some embodiments, the nucleotide analogs find use in nucleic acid sequencing, e.g., “next generation sequencing” (NGS). For example, DNA sequencing-by-synthesis (SBS) involves determining DNA sequence by detecting certain signals (e.g., pyrophosphate groups) that are generated when a nucleotide is incorporated by a polymerase reaction. Other SBS methods involve alternate means of detecting the polymerase addition of nucleotides such as detection of light emission, change in fluorescence, change in pH, or some other physical or chemical change. For example, Illumina's reversible terminator sequencing relies upon dye-containing reversible terminator bases. When one such base is added to the growing nucleic acid polymer, the reaction is halted and the dye on the terminal nucleic acid is detectable. The terminator-containing molecule can then be treated with a cleavage enzyme that reverses the termination and allows for the addition of additional nucleotides. This step-wise process is an improvement on earlier technology, but the extra cleavage step and subsequent sample purification leave room for further improvement.

In some embodiments, the present invention provides functional terminator nucleotides containing 3′-alkynes that are incorporated into a growing nucleic acid polymer and terminate the extension reaction. The 3′-alkyne can be immediately used in a reaction with an azide-modified tag through click chemistry. The linkage created through click chemistry mimics a natural nucleic acid phosphodiester bond and provides for the use of the conjugated product in subsequent enzymatic reactions such as PCR. In this way, some embodiments of the present invention eschew the terminator cleavage step of the reversible terminator sequencing reaction and thereby decrease the run time of the reaction (see, e.g., the embodiment depicted in FIG. 2).

In some embodiments, a nucleotide analog, e.g., a 3′-alkynyl nucleotide analog (e.g., a 3′-O-propargyl nucleotide analog such as a 3′-O-propargyl dNTP) is used in a polymerase reaction and nucleic acid extension products are made in which the 3′ end comprises an alkyne group. The alkyne-modified nucleic acid products can subsequently be used as a specific substrate in chemical ligation reactions with compounds containing azido moieties through click chemistry (e.g., a copper(I)-catalyzed 1,3-dipolar cyclo-addition reaction). This type of click chemistry conjugates alkynes and azides, forming a covalent linkage (e.g., a five-membered triazole ring) between the alkyne-containing compound and the azide-containing compound. For example, a 5′-azide-modified DNA fragment can be chemically ligated to a 3′-alkyne-modified DNA fragment using click chemistry. This conjugated DNA product can then be used as input in subsequent enzymatic reactions such as PCR or sequencing because the covalent linkage created by the five-membered triazole ring mimics the natural phosphodiester bond of the DNA backbone and does not significantly and/or detectably inhibit subsequent enzymatic activities.

The contemplated reactions involving the nucleotide analogs provide multiple potential detection events. In some embodiments, the nucleotide analog incorporates a specific fluorophore into the elongating nucleic acid strand. In some embodiments, the addition of the nucleotide analog creates a detectable signal such as pyrophosphate. In some embodiments, the incorporation of the nucleotide analog can be detected by emission of light, change in fluorescence, change in pH, change in conformation, or some other chemical change. In some embodiments the click chemistry reaction between the incorporated nucleotide analog and a compound comprising an azido moiety can be detected in ways similar to the incorporation of the nucleotide analog.

Because of the click chemistry, the alkyne-containing nucleotide analogs readily react with compounds containing azido moieties. Using this click chemistry, various tags can be inserted covalently into an elongating nucleic acid strand that contains one of the nucleotide analogs. Examples of such tags include but are not limited to fluorescent dyes, DNA, RNA, oligonucleotides, nucleosides, proteins, amino acids, polypeptides, polysaccharides, nucleic acid, synthetic polymers, and viruses.

The technology relates in some embodiments to methods for sequencing a nucleic acid. In some embodiments, sequencing is performed by the following sequence of events with the exemplary use of a nucleotide analog comprising a 3′-O-propargyl moiety. First, the nucleotide analog is oriented in the polymerase active site (e.g., by a polymerase) to be base-paired to a complementary base of the template strand and to be adjacent to the free 3′ hydroxyl of the growing synthesized strand. Next, the nucleotide analog is added to the 3′ end of a growing strand by the polymerase, e.g., by the enzyme-catalyzed attack of the 3′ hydroxyl on the alpha-phosphate of the nucleotide analog. Further extension of the strand by the polymerase is blocked by the 3′-O-propargyl terminating group on the incorporated nucleotide analog. In some embodiments, the strand is then subjected to a PCR reaction and used in various sequencing methods.

In some embodiments, the 3′-O-propargyl terminating moiety is treated with an azide-tagged DNA molecule. This removes the terminator alkyne. Once the terminator has been removed, the growing strand is free for further polymerization: the next base is incorporated to continue another cycle, e.g., a nucleotide analog is oriented in the polymerase active site, the nucleotide analog is added to the 3′ end of the growing strand by the polymerase, and the nucleotide analog is queried to identify the base added.
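By way of illustration only, the cycle described above (incorporation of a terminating analog, querying of the added base, azide treatment, and continuation to the next base) can be sketched as a toy model in Python. The function name, the string representation of the strands, and the restriction to A, C, G, and T are assumptions made for this sketch; it models only the order of events and does not simulate any chemistry or enzyme kinetics.

```python
# Illustrative toy model; all names are hypothetical and no chemistry is simulated.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def read_by_cyclic_termination(template: str, primer: str) -> str:
    """Return the bases identified, one per cycle, as the primer is extended.

    `template` is written 3'->5' and `primer` 5'->3', so that index i of the
    growing strand pairs with index i of the template.
    """
    if len(primer) > len(template):
        raise ValueError("primer is longer than the template")
    for i, base in enumerate(primer):
        if COMPLEMENT[template[i]] != base:
            raise ValueError("primer is not complementary to the template")

    strand = list(primer)
    identified = []
    while len(strand) < len(template):
        # Incorporation: the analog complementary to the next template base is
        # added to the 3' end; its 3'-O-propargyl group blocks further extension.
        analog = COMPLEMENT[template[len(strand)]]
        strand.append(analog)

        # Query: identify the analog that was just incorporated.
        identified.append(analog)

        # Azide treatment: per the description above, the strand then becomes
        # available for the next cycle, so the loop simply continues.
    return "".join(identified)


print(read_by_cyclic_termination("TACGGT", "AT"))  # GCCA
```

In this sketch exactly one base is identified per cycle, reflecting the single-base extension imposed by the terminating group.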

Some embodiments relate to parallel (e.g., massively parallel) sequencing.

In some embodiments, the technology described herein is related to a method for sequencing nucleic acid comprising: hybridizing a primer to a nucleic acid to form a hybridized primer/nucleic acid complex; providing a plurality of nucleotide analogs, each nucleotide analog comprising a nucleotide and an alkyne moiety attached to the nucleotide; reacting the hybridized primer/nucleic acid complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog; querying the extended product to identify the incorporated nucleotide analog; and reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring. In some embodiments, the nucleotide analogs comprise 3′-O-propargyl-dNTP where N is selected from the group consisting of A, C, G, T, and U. In some embodiments, the nucleic acid conjugate comprising a triazole ring is used in subsequent enzymatic reactions such as polymerase chain reaction. In some embodiments, the method includes providing conventional nucleotides during the same step that the nucleotide analogs are provided.

In some embodiments, the technology described herein provides a method for sequencing a nucleic acid comprising: hybridizing a primer to a nucleic acid to form a hybridized primer/nucleic acid complex; providing a plurality of nucleotides (some of which are nucleotide analogs comprising a nucleotide and an alkyne moiety attached to the nucleotide); reacting the hybridized primer/nucleic acid complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog; and querying the structure comprising a triazole ring to identify which nucleotide analog was incorporated.

In some embodiments, the methods further comprise reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring. In some embodiments, the nucleotide analogs comprise 3′-O-propargyl-dNTP where N is selected from the group consisting of A, C, G, T, and U. In some embodiments, the structure comprising a triazole ring is used in subsequent enzymatic reactions such as polymerase chain reaction. In some embodiments, the method includes providing conventional nucleotides during the same step that the nucleotide analogs are provided.

In some particular embodiments comprising use of a polymerase to incorporate the nucleotide analogs into a nucleic acid (e.g., PCR, primer extension, DNA sequencing (e.g., NGS), single-base extension, etc.), the polymerase is a polymerase obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species (e.g., an organism of the taxonomic lineage Archaea; Euryarchaeota; Thermococci; Thermococcales; Thermococcaceae; Thermococcus). In some embodiments, the polymerase is obtained from, derived from, isolated from, cloned from, etc. a Thermococcus species 9°N-7 (see, e.g., Southworth, et al. (1996) “Cloning of thermostable DNA polymerases from hyperthermophilic marine Archaea with emphasis on Thermococcus sp. 9°N-7 and mutations affecting 3′-5′ exonuclease activity” Proc. Natl. Acad. Sci. USA 93: 5281, incorporated herein by reference in its entirety). The nucleotide sequence encoding the wild type Thermococcus sp. 9°N-7 polymerase is provided by GenBank Accession Number U47108 (e.g., the polymerase gene starts at nucleotide 40 of Accession Number U47108) and the amino acid sequence of the wild type Thermococcus sp. 9°N-7 polymerase is provided by GenBank Accession Number AAA88769.

In some embodiments, the polymerase comprises amino acid substitutions that provide for improved incorporation of modified substrates such as modified dideoxynucleotides, ribonucleotides, and acyclonucleotides. In some embodiments, the polymerase comprises amino acid substitutions that provide for improved incorporation of nucleotide analogs comprising modified 3′ functional groups such as the 3′-O-propargyl dNTPs described herein. In some embodiments, the amino acid sequence of the polymerase comprises one or more amino acid substitutions relative to the Thermococcus sp. 9°N-7 wild-type polymerase amino acid sequence, e.g., a substitution of alanine for the aspartic acid at amino acid position 141 (D141A), a substitution of alanine for the glutamic acid at amino acid position 143 (E143A), a substitution of valine for the tyrosine at amino acid position 409 (Y409V), and/or a substitution of leucine for the alanine at amino acid position 485 (A485L).
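As a non-limiting illustration of the substitution notation used above (wild-type residue, 1-based position, replacement residue, e.g., D141A), the following Python sketch applies a list of such substitutions to a one-letter amino acid sequence and checks that the stated wild-type residue is actually present at each position. The helper names are hypothetical, and the placeholder sequence is synthetic; it is not the Thermococcus sp. 9°N-7 polymerase sequence, which would in practice be taken from the GenBank record cited above.

```python
import re
from typing import Iterable

# Pattern for substitutions written as <wild-type residue><position><new residue>.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(wild_type: str, substitutions: Iterable[str]) -> str:
    """Return a copy of `wild_type` with each substitution (e.g., "D141A") applied."""
    residues = list(wild_type)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"{sub}: expected {old} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


def make_placeholder_sequence(length: int = 500) -> str:
    """Build a synthetic stand-in sequence (NOT a real polymerase sequence)."""
    residues = ["G"] * length
    for position, residue in {141: "D", 143: "E", 409: "Y", 485: "A"}.items():
        residues[position - 1] = residue
    return "".join(residues)


placeholder = make_placeholder_sequence()
variant = apply_substitutions(placeholder, ["D141A", "E143A", "Y409V", "A485L"])
print(variant[140], variant[142], variant[408], variant[484])  # A A V L
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, e.g., when a sequence record includes or omits an initiator methionine.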

In some embodiments, the polymerase is provided in a heterologous host organism such as Escherichia coli that comprises a cloned Thermococcus sp. 9°N-7 polymerase gene, e.g., comprising one or more mutations (e.g., D141A, E143A, Y409V, and/or A485L). In some embodiments, the polymerase is a Thermococcus sp. 9°N-7 polymerase sold under the trade name THERMINATOR (e.g., THERMINATOR II) by New England BioLabs (Ipswich, Mass.).

In some embodiments, methods for producing polymerase mutants and screening their activities (e.g., incorporation of modified nucleotides) are described in, e.g., Gardner and Jack (1999) “Determinants of nucleotide sugar recognition in an archaeon DNA polymerase” Nucleic Acids Research 27(12): 2545, incorporated by reference herein in its entirety. In particular, methods for producing and identifying polymerase mutants that incorporate modified nucleotides are provided by, e.g., Gardner and Jack (2002) “Acyclic and dideoxy terminator preferences denote divergent sugar recognition by archaeon and Taq DNA polymerases” Nucleic Acids Research 30(2): 605, incorporated by reference herein in its entirety. Additional assays for characterizing the incorporation of modified nucleotides by various polymerases are described, e.g., in Ruparel et al. (2005). Proc. Natl. Acad. Sci. USA. 26: 5932; Barnes (1978). J. Mol. Biol. 119: 83; Sanger et al. (1977). Proc. Natl. Acad. Sci. USA. 74: 5463; Haff and Smirnov (1997) Genome Methods 7: 378; and in U.S. Pat. No. 5,558,991, each incorporated herein by reference in its entirety.

6. Uses

The nucleotide analogs provided herein find use in a wide range of applications. Non-limiting examples of uses for the nucleotide analogs described include use as antiviral and/or anticancer agents. In some embodiments, the nucleotide analogs provided herein find use in diagnostic medical imaging, e.g., as contrast agents for use in, e.g., MRI, computed tomography (CT) scans, X-ray imaging, angiography (e.g., venography, digital subtraction angiography (DSA), arteriography), intravenous urography, intravenous pyelography, myelography, interventional medicine (e.g., angioplasty (e.g., percutaneous transluminal angioplasty), artery ablation and/or occlusion (e.g., to treat cancer and/or vascular abnormalities), and placement of stents), arthrography, sialography, retrograde choledocho-pancreatography, micturating cystography, etc. Additional illustrative and non-limiting uses for such contrast agents include in vivo imaging for human diagnostics, drug discovery, and drug development in model systems (mouse models, etc.).

In some embodiments, an oligonucleotide comprising one or more nucleotide analogs described herein finds use in a nanoconjugate (e.g., comprising nanoparticles such as titanium dioxide nanoparticles, an oligonucleotide (e.g., comprising a nucleotide analog), and/or a contrast agent (e.g., a heavy metal contrast agent such as gadolinium)) for use in imaging and/or therapy (e.g., neutron-capture cancer therapy). See, e.g., Paunesku et al. Nanomedicine 4(3): 201-7, 2008.

In some embodiments, the technology finds use as a drug delivery tag, e.g., for the targeted cellular delivery of oligonucleotide and antisense therapeutics (e.g., siRNA, miRNA, etc.). In some embodiments, the technology finds use for the delivery of drugs linked to a nucleic acid comprising a nucleotide analog, wherein the nucleic acid serves as a targeting moiety. In some embodiments, the technology comprises use of a cell targeting moiety to direct and/or deliver an oligonucleotide to a particular cell, tissue, organ, etc. The cell targeting moiety imbues compounds (e.g., an oligonucleotide (e.g., oligonucleotide analog) according to the technology described herein linked to a cell-targeting/drug delivery moiety, e.g., as described below) with characteristics such that the compounds and/or oligonucleotides are preferentially recognized, bound, imported, processed, activated, etc. by one or more target cell types relative to one or more other non-target cell types. For example, endothelial cells have a high affinity for the peptide targeting moiety Arg-Gly-Asp (RGD), cancer and kidney cells preferentially interact with compounds having a folic acid moiety, immune cells have an affinity for mannose, and cardiomyocytes have an affinity for the peptide CWLSEAGPVVTVRALRGTGSW (see, e.g., Biomaterials 31: 8081-8087, 2010). Other targeting/delivery moieties are known in the art. Accordingly, compounds comprising a targeting moiety preferentially interact with and are taken up by the targeted cell type.

In some embodiments, the compounds comprise an RGD peptide. RGD peptides comprise 4 to 30 (e.g., 5 to 20 or 5 to 15) amino acids and target tumor cells (e.g., endothelial tumor cells). Such peptides and agents derived therefrom are known in the art, and are described by Beer et al. in Methods Mol. Biol. 680: 183-200 (2011) and in Theranostics 1: 48-57 (2011); by Morrison et al. in Theranostics 1: 149-153 (2011); by Zhou et al. in Theranostics 1: 58-82 (2011); and by Auzzas et al. in Curr. Med. Chem. 17: 1255-1299 (2010).

In some embodiments, the targeting moiety is folic acid, e.g., for targeting to cells expressing the folate receptor. The folate receptor is overexpressed on the cell surfaces of human cancer cells in, e.g., cancers of the brain, kidney, lung, ovary, and breast relative to lower levels in normal cells (see, e.g., Sudimack J, et al. 2000 “Targeted drug delivery via the folate receptor” Adv Drug Deliv Rev 41: 147-162).

In some embodiments, the targeting moiety comprises transferrin, which targets the compounds to, e.g., macrophages, erythroid precursors in bone marrow, and cancer cells. When a transferrin protein encounters a transferrin receptor on the surface of a cell, the transferrin receptor binds to the transferrin and transports the transferrin into the cell. Drugs and other compounds and/or moieties linked to the transferrin are also transported to the cell and, in some cases, imported into the cells. In some embodiments, a fragment of a transferrin targets the compounds of the technology to the target cell. See, e.g., Qian et al. (2002) “Targeted drug delivery via the transferrin receptor-mediated endocytosis pathway”, Pharmacol Rev. 54: 561-87; Daniels et al. (2006) “The transferrin receptor part I: Biology and targeting with cytotoxic antibodies for the treatment of cancer”, Clin. Immunol. 121: 144-58.

In some embodiments, the targeting moiety comprises the peptide VHSPNKK. This peptide targets compounds to cells expressing vascular cell adhesion molecule 1 (VCAM-1), e.g., to activated endothelial cells. Targeting activated endothelial cells finds use, e.g., in delivery of therapeutic agents to cells for treatment of inflammation and cancer. Certain melanoma cells use VCAM-1 to adhere to the endothelium and VCAM-1 participates in monocyte recruitment to atherosclerotic sites. Accordingly, the peptide VHSPNKK finds use in targeting compounds of the present technology to cancer (e.g., melanoma) cells and atherosclerotic sites.

See, e.g., Lochmann, et al. (2004) “Drug delivery of oligonucleotides by peptides” Eur. J. Pharmaceutics and Biopharmaceutics 58: 237-251, incorporated herein by reference, discussing targeting moieties and the cells targeted by those moieties.

In some embodiments, the cell-targeting moiety comprises an antibody, or derivative or fragment thereof. Antibodies to cell-specific molecules such as, e.g., proteins (e.g., cell-surface proteins, membrane proteins, proteoglycans, glycoproteins, peptides, and the like); polynucleotides (nucleic acids, nucleotides); lipids (e.g., phospholipids, glycolipids, and the like), or fragments thereof comprising an epitope or antigen specifically recognized by the antibody, target compounds according to the technology to the cells expressing the cell-specific molecules.

For example, many antibodies and antibody fragments specifically bind markers produced by or associated with tumors or infectious lesions, including viral, bacterial, fungal, and parasitic infections, and antigens and products associated with such microorganisms (see, e.g., U.S. Pat. Nos. 3,927,193; 4,331,647; 4,348,376; 4,361,544; 4,468,457; 4,444,744; 4,460,459; 4,460,561; 4,818,709; and 4,624,846, incorporated herein by reference). Moreover, antibodies that target myocardial infarctions are disclosed in, e.g., U.S. Pat. No. 4,036,945. Antibodies that target normal tissues or organs are disclosed in, e.g., U.S. Pat. No. 4,735,210. Anti-fibrin antibodies are known in the art, as are antibodies that bind to atherosclerotic plaque and to lymphocyte autoreactive clones.

For cancer (e.g., breast cancer) and its metastases, a specific marker or markers may be chosen from cell surface markers such as, for example, members of the MUC-type mucin family, an epidermal growth factor receptor (EGFR), a carcinoembryonic antigen (CEA), a human carcinoma antigen, a vascular endothelial growth factor (VEGF) antigen, a melanoma antigen gene (MAGE) family antigen, a T/Tn antigen, a hormone receptor, growth factor receptors, a cluster designation/differentiation (CD) antigen, a tumor suppressor gene, a cell cycle regulator, an oncogene, an oncogene receptor, a proliferation marker, an adhesion molecule, a proteinase involved in degradation of extracellular matrix, a malignant transformation related factor, an apoptosis related factor, a human carcinoma antigen, glycoprotein antigens, DF3, 4F2, MGFM antigens, breast tumor antigen CA 15-3, calponin, cathepsin, CD 31 antigen, proliferating cell nuclear antigen 10 (PC 10), and pS2. For other forms of cancer and their metastases, a specific marker or markers may be selected from cell surface markers such as, for example, vascular endothelial growth factor receptor (VEGFR) family, a member of carcinoembryonic antigen (CEA) family, a type of anti-idiotypic mAb, a type of ganglioside mimic, a member of cluster designation differentiation antigens, a member of epidermal growth factor receptor (EGFR) family, a type of a cellular adhesion molecule, a member of MUC-type mucin family, a type of cancer antigen (CA), a type of a matrix metalloproteinase, a type of glycoprotein antigen, a type of melanoma associated antigen (MAA), a proteolytic enzyme, a calmodulin, a member of tumor necrosis factor (TNF) receptor family, a type of angiogenesis marker, a melanoma antigen recognized by T cells (MART) antigen, a member of melanoma antigen encoding gene (MAGE) family, a prostate-specific membrane antigen (PSMA), a small cell lung carcinoma antigen (SCLCA), a T/Tn antigen, a hormone receptor, a tumor suppressor gene antigen, a cell cycle regulator antigen, an oncogene antigen, an oncogene receptor antigen, a proliferation marker, a proteinase involved in degradation of extracellular matrix, a malignant transformation related factor, an apoptosis-related factor, and a type of human carcinoma antigen.

The antibody may have an affinity for a target associated with a disease of the immune system such as, for example, a protein, a cytokine, a chemokine, an infectious organism, and the like. In another embodiment, the antibody may be targeted to a predetermined target associated with a pathogen-borne condition. The particular target and the antibody may be specific to, but not limited to, the type of the pathogen-borne condition. A pathogen is defined as any disease-producing agent such as, for example, a bacterium, a virus, a microorganism, a fungus, a prion, and a parasite. The antibody may have an affinity for the pathogen or pathogen associated matter. The antibody may have an affinity for a cell marker or markers associated with a pathogen-borne condition. The marker or markers may be selected such that they represent a viable target on infected cells. For a pathogen-borne condition, the antibody may be selected to target the pathogen itself. For a bacterial condition, a predetermined target may be the bacterium itself, for example, Escherichia coli or Bacillus anthracis. For a viral condition, a predetermined target may be the virus itself, for example, Cytomegalovirus (CMV), Epstein-Barr virus (EBV), a hepatitis virus, such as Hepatitis B virus, human immunodeficiency virus, such as HIV, HIV-1, or HIV-2, or a herpes virus, such as Herpes virus 6. For a parasitic condition, a predetermined target may be the parasite itself, for example, Trypanosoma cruzi, Kinetoplastid, Schistosoma mansoni, Schistosoma japonicum, or Schistosoma brucei. For a fungal condition, a predetermined target may be the fungus itself, for example, Aspergillus, Candida, Cryptococcus neoformans, or Rhizomucor.

In another embodiment, the antibody may be targeted to a predetermined target associated with an undesirable target. The particular target and antibody may be specific to, but not limited to, the type of the undesirable target. An undesirable target is a target that may be associated with a disease or an undesirable condition, but also present in the normal condition. For example, the target may be present at elevated concentrations or otherwise be altered in the disease or undesirable state. The antibody may have an affinity for the undesirable target or for biological molecular pathways related to the undesirable target. The antibody may have an affinity for a cell marker or markers associated with the undesirable target. For an undesirable target, the choice of a predetermined target may be important to therapy utilizing the compounds according to the present technology (e.g., the drug and/or therapeutic moieties). The antibody may be selected to target biological matter associated with a disease or undesirable condition. For arteriosclerosis, a predetermined target may be, for example, apolipoprotein B on low density lipoprotein (LDL). For obesity, a predetermined marker or markers may be chosen from cell surface markers such as, for example, one of gastric inhibitory polypeptide receptor and CD36 antigen. Another undesirable predetermined target may be clotted blood.

In another embodiment, the antibody may be targeted to a predetermined target associated with a reaction to an organ transplanted into the patient. The particular target and antibody may be specific to, but not limited to, the type of organ transplant. The antibody may have an affinity for a biological molecule associated with a reaction to an organ transplant. The antibody may have an affinity for a cell marker or markers associated with a reaction to an organ transplant. The marker or markers may be selected such that they represent a viable target on T cells or B cells of the immune system.

In another embodiment, the antibody may be targeted to a predetermined target associated with a toxin in the patient. A toxin is defined as any poison produced by an organism including, but not limited to, bacterial toxins, plant toxins, insect toxins, animal toxins, and man-made toxins. The particular target and antibody may be specific to, but not limited to, the type of toxin. The antibody may have an affinity for the toxin or a biological molecule associated with a reaction to the toxin. The antibody may have an affinity for a cell marker or markers associated with a reaction to the toxin.

In another embodiment, the antibody may be targeted to a predetermined target associated with a hormone-related disease. The particular target and antibody may be specific to, but not limited to, a particular hormone disease. The antibody may have an affinity for a hormone or a biological molecule associated with the hormone pathway. The antibody may have an affinity for a cell marker or markers associated with the hormone disease.

In another embodiment, the antibody may be targeted to a predetermined target associated with non-cancerous diseased tissue. The particular target and antibody may be specific to, but not limited to, a particular non-cancerous diseased tissue, such as non-cancerous diseased deposits and precursor deposits. The antibody may have an affinity for a biological molecule associated with the non-cancerous diseased tissue. The antibody may have an affinity for a cell marker or markers associated with the non-cancerous diseased tissue.

In another embodiment, the antibody may be targeted to a proteinaceous pathogen. The particular target and antibody may be specific to, but not limited to, a particular proteinaceous pathogen. The antibody may have an affinity for a proteinaceous pathogen or a biological molecule associated with the proteinaceous pathogen. The antibody may have an affinity for a cell marker or markers associated with the proteinaceous pathogen. For prion diseases, also known as transmissible spongiform encephalopathies, a predetermined target may be, for example, Prion protein 3F4.

See, e.g., U.S. Pat. Appl. Pub. No. 20050090732 (in particular Table I), incorporated herein by reference for a list of targets, cell-specific markers (e.g., antigens for targeting with an antibody moiety), antibodies, and indications associated with those targets, cell-specific markers, and antigens/antibodies.

In some embodiments, the technology finds use in imaging, such as for in situ hybridization (ISH). In some embodiments, the nucleotide analogs provided herein find use in nucleic acids that are hybridization probes for ISH and fluorescence in situ hybridization (FISH). In some embodiments, the nucleotide analogs find use in direct ISH and/or for immuno-histochemistry applications without using secondary detection reagents.

7. Pharmaceutical Formulations

In some embodiments, nucleotide analogs, oligonucleotides comprising a nucleotide analog, etc. are provided in a pharmaceutical formulation for administration to a subject. It is generally contemplated that the compounds (e.g., nucleotide analogs, oligonucleotides comprising a nucleotide analog, conjugates of nucleotide analogs and/or oligonucleotides comprising a nucleotide analog, etc.) related to the technology are formulated for administration to a mammal, and especially to a human with a condition that is responsive to the administration of such compounds. Therefore, where contemplated compounds are administered in a pharmacological composition, it is contemplated that the contemplated compounds are formulated in admixture with a pharmaceutically acceptable carrier. For example, contemplated compounds can be administered orally as pharmacologically acceptable salts, or intravenously in a physiological saline solution (e.g., buffered to a pH of about 7.2 to 7.5). Conventional buffers such as phosphates, bicarbonates, or citrates can be used for this purpose. Of course, one of ordinary skill in the art may modify the formulations within the teachings of the specification to provide numerous formulations for a particular route of administration. In particular, contemplated compounds may be modified to render them more soluble in water or other vehicle, which, for example, may be easily accomplished with minor modifications (salt formulation, esterification, etc.) that are well within the ordinary skill in the art. It is also well within the ordinary skill of the art to modify the route of administration and dosage regimen of a particular compound to manage the pharmacokinetics of the present compounds for maximum beneficial effect in a patient.

In certain pharmaceutical dosage forms, prodrug forms of contemplated compounds may be formed for various purposes, including reduction of toxicity, increasing the organ or target cell specificity, etc. Among various prodrug forms, acylated (acetylated or other) derivatives, pyridine esters, and various salt forms of the present compounds are preferred. One of ordinary skill in the art will recognize how to modify the present compounds to prodrug forms to facilitate delivery of active compounds to a target site within the host organism or patient. One of ordinary skill in the art will also take advantage of favorable pharmacokinetic parameters of the prodrug forms, where applicable, in delivering the present compounds to a targeted site within the host organism or patient to maximize the intended effect of the compound. Similarly, it should be appreciated that contemplated compounds may also be metabolized to their biologically active form, and all metabolites of the compounds herein are therefore specifically contemplated. In addition, contemplated compounds (and combinations thereof) may be administered in combination with yet further agents.

With respect to administration to a subject, it is contemplated that the compounds be administered in a pharmaceutically effective amount. One of ordinary skill recognizes that a pharmaceutically effective amount varies depending on the therapeutic agent used, the subject's age, condition, and sex, and on the extent of the disease in the subject. Generally, the dosage should not be so large as to cause adverse side effects, such as hyperviscosity syndromes, pulmonary edema, congestive heart failure, and the like. The dosage can also be adjusted by the individual physician or veterinarian to achieve the desired therapeutic goal.

As used herein, the actual amount encompassed by the term “pharmaceutically effective amount” will depend on the route of administration, the type of subject being treated, and the physical characteristics of the specific subject under consideration. These factors and their relationship to determining this amount are well known to skilled practitioners in the medical, veterinary, and other related arts. This amount and the method of administration can be tailored to maximize efficacy but will depend on such factors as weight, diet, concurrent medication, and other factors that those skilled in the art will recognize.

Pharmaceutical compositions preferably comprise one or more compounds of the present technology associated with one or more pharmaceutically acceptable carriers, diluents, or excipients. Pharmaceutically acceptable carriers are known in the art such as those described in, for example, Remington's Pharmaceutical Sciences, Mack Publishing Co. (A. R. Gennaro edit. 1985), explicitly incorporated herein by reference for all purposes.

Accordingly, in some embodiments, the immunotherapeutic agent is formulated as a tablet; a capsule; a time release tablet; a time release capsule; a time release pellet; a slow release tablet; a slow release capsule; a slow release pellet; a fast release tablet; a fast release capsule; a fast release pellet; a sublingual tablet; a gel capsule; a microencapsulation; a transdermal delivery formulation; a transdermal gel; a transdermal patch; a sterile solution; a sterile solution prepared for use as an intramuscular or subcutaneous injection, for use as a direct injection into a targeted site, or for intravenous administration; a solution prepared for rectal administration; a solution prepared for administration through a gastric feeding tube or duodenal feeding tube; a suppository for rectal administration; a liquid for oral consumption prepared as a solution or an elixir; a topical cream; a gel; a lotion; a tincture; a syrup; an emulsion; or a suspension.

In some embodiments, the time release formulation is a sustained-release, sustained-action, extended-release, controlled-release, modified release, or continuous-release mechanism, e.g., the composition is formulated to dissolve quickly, slowly, or at any appropriate rate of release of the compound over time.

In some embodiments, the compositions are formulated so that the active ingredient is embedded in a matrix of an insoluble substance (e.g., various acrylics, chitin) such that the dissolving compound finds its way out through the holes in the matrix, e.g., by diffusion. In some embodiments, the formulation is enclosed in a polymer-based tablet with a laser-drilled hole on one side and a porous membrane on the other side. Stomach acids push through the porous membrane, thereby pushing the drug out through the laser-drilled hole. In time, the entire drug dose releases into the system while the polymer container remains intact, to be excreted later through normal digestion. In some sustained-release formulations, the compound dissolves into the matrix and the matrix physically swells to form a gel, allowing the compound to exit through the gel's outer surface. In some embodiments, the formulations are in a micro-encapsulated form, e.g., which is used in some embodiments to produce a complex dissolution profile. For example, by coating the compound around an inert core and layering it with insoluble substances to form a microsphere, some embodiments provide more consistent and replicable dissolution rates in a convenient format that is combined in particular embodiments with other controlled (e.g., instant) release pharmaceutical ingredients, e.g., to provide a multipart gel capsule.

In some embodiments, the pharmaceutical preparations and/or formulations of the technology are provided in particles. “Particles” as used herein in a pharmaceutical context means nano- or microparticles (or in some instances larger) that can consist in whole or in part of the compounds as described herein. The particles may contain the preparations and/or formulations in a core surrounded by a coating, including, but not limited to, an enteric coating. The preparations and/or formulations also may be dispersed throughout the particles. The preparations and/or formulations also may be adsorbed into the particles. The particles may be of any order release kinetics, including zero order release, first order release, second order release, delayed release, sustained release, immediate release, and any combination thereof, etc. The particle may include, in addition to the preparations and/or formulations, any of those materials routinely used in the art of pharmacy and medicine, including, but not limited to, erodible, nonerodible, biodegradable, or nonbiodegradable materials or combinations thereof. The particles may be microcapsules which contain the formulation in a solution or in a semi-solid state. The particles may be of virtually any shape.
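For orientation only, the zero order, first order, and second order release kinetics named above are conventionally written as the following rate laws and their integrated forms. The symbols are assumptions introduced for this illustration and do not appear elsewhere in the specification: M(t) is the amount of compound remaining in the particle at time t, M_0 is the initial amount, and k_0, k_1, and k_2 are the respective rate constants.

```latex
\begin{align*}
\text{zero order:}   \quad & \frac{dM}{dt} = -k_0,     & M(t) &= M_0 - k_0 t \\
\text{first order:}  \quad & \frac{dM}{dt} = -k_1 M,   & M(t) &= M_0 e^{-k_1 t} \\
\text{second order:} \quad & \frac{dM}{dt} = -k_2 M^2, & \frac{1}{M(t)} &= \frac{1}{M_0} + k_2 t
\end{align*}
```

Under these idealized forms, zero order release delivers the compound at a constant rate until the particle is depleted, whereas first and second order release slow as the remaining amount decreases.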

Both non-biodegradable and biodegradable polymeric materials can be used in the manufacture of particles for delivering the preparations and/or formulations. Such polymers may be natural or synthetic polymers. The polymer is selected based on the period of time over which release is desired. Bioadhesive polymers of particular interest include bioerodible hydrogels described by H. S. Sawhney, C. P. Pathak and J. A. Hubbell in Macromolecules, (1993) 26: 581-587, the teachings of which are incorporated herein by reference. These include polyhyaluronic acids, casein, gelatin, glutin, polyanhydrides, polyacrylic acid, alginate, chitosan, poly(methyl methacrylates), poly(ethyl methacrylates), poly(butyl methacrylate), poly(isobutyl methacrylate), poly(hexyl methacrylate), poly(isodecyl methacrylate), poly(lauryl methacrylate), poly(phenyl methacrylate), poly(methyl acrylate), poly(isopropyl acrylate), poly(isobutyl acrylate), and poly(octadecyl acrylate).

The technology also provides methods for preparing stable pharmaceutical preparations containing aqueous solutions of the compounds or salts thereof to inhibit formation of degradation products. A solution is provided that contains the compound or salts thereof and at least one inhibiting agent. The solution is processed under at least one sterilization technique prior to and/or after terminal filling the solution in the sealable container to form a stable pharmaceutical preparation. The present formulations may be prepared by various methods known in the art so long as the formulation is substantially homogenous, e.g., the pharmaceutical is distributed substantially uniformly within the formulation. Such uniform distribution facilitates control over drug release from the formulation.

In some embodiments, the compound is formulated with a buffering agent. The buffering agent may be any pharmaceutically acceptable buffering agent. Buffer systems include citrate buffers, acetate buffers, borate buffers, and phosphate buffers. Examples of buffers include citric acid, sodium citrate, sodium acetate, acetic acid, sodium phosphate and phosphoric acid, sodium ascorbate, tartaric acid, maleic acid, glycine, sodium lactate, lactic acid, ascorbic acid, imidazole, sodium bicarbonate and carbonic acid, sodium succinate and succinic acid, histidine, and sodium benzoate and benzoic acid.

In some embodiments, the compound is formulated with a chelating agent. The chelating agent may be any pharmaceutically acceptable chelating agent. Chelating agents include ethylenediaminetetraacetic acid (also synonymous with EDTA, edetic acid, versene acid, and sequestrene), and EDTA derivatives, such as dipotassium edetate, disodium edetate, edetate calcium disodium, sodium edetate, trisodium edetate, and potassium edetate. Other chelating agents include citric acid and derivatives thereof. Citric acid also is known as citric acid monohydrate. Derivatives of citric acid include anhydrous citric acid and trisodium citrate dihydrate. Still other chelating agents include niacinamide and derivatives thereof and sodium desoxycholate and derivatives thereof.

In some embodiments, the compound is formulated with an antioxidant. The antioxidant may be any pharmaceutically acceptable antioxidant. Antioxidants are well known to those of ordinary skill in the art and include materials such as ascorbic acid, ascorbic acid derivatives (e.g., ascorbyl palmitate, ascorbyl stearate, sodium ascorbate, calcium ascorbate, etc.), butylated hydroxyanisole, butylated hydroxytoluene, alkyl gallate, sodium meta-bisulfate, sodium bisulfate, sodium dithionite, sodium thioglycollic acid, sodium formaldehyde sulfoxylate, tocopherol and derivatives thereof (d-alpha tocopherol, d-alpha tocopherol acetate, dl-alpha tocopherol acetate, d-alpha tocopherol succinate, beta tocopherol, delta tocopherol, gamma tocopherol, and d-alpha tocopherol polyoxyethylene glycol 1000 succinate), monothioglycerol, and sodium sulfite. Such materials are typically added in ranges from 0.01 to 2.0%.

In some embodiments, the compound is formulated with a cryoprotectant. The cryoprotecting agent may be any pharmaceutically acceptable cryoprotecting agent. Common cryoprotecting agents include histidine, polyethylene glycol, polyvinyl pyrrolidine, lactose, sucrose, mannitol, and polyols.

In some embodiments, the compound is formulated with an isotonicity agent. The isotonicity agent can be any pharmaceutically acceptable isotonicity agent. This term is used in the art interchangeably with iso-osmotic agent, and is known as a compound which is added to the pharmaceutical preparation to increase the osmotic pressure, e.g., in some embodiments to that of 0.9% sodium chloride solution, which is iso-osmotic with human extracellular fluids, such as plasma. Preferred isotonicity agents are sodium chloride, mannitol, sorbitol, lactose, dextrose and glycerol.

The pharmaceutical preparation may optionally comprise a preservative. Common preservatives include those selected from the group consisting of chlorobutanol, parabens, thimerosal, benzyl alcohol, and phenol. Suitable preservatives include but are not limited to: chlorobutanol (0.3-0.9% w/v), parabens (0.01-5.0%), thimerosal (0.004-0.2%), benzyl alcohol (0.5-5%), phenol (0.1-1.0%), and the like.

In some embodiments, the compound is formulated with a humectant to provide a pleasant mouth-feel in oral applications. Humectants known in the art include cholesterol, fatty acids, glycerin, lauric acid, magnesium stearate, pentaerythritol, and propylene glycol.

In some embodiments, an emulsifying agent is included in the formulations, for example, to ensure complete dissolution of all excipients, especially hydrophobic components such as benzyl alcohol. Many emulsifiers are known in the art, e.g., polysorbate 60.

For some embodiments related to oral administration, it may be desirable to add a pharmaceutically acceptable flavoring agent and/or sweetener. Compounds such as saccharin, glycerin, simple syrup, and sorbitol are useful as sweeteners.

8. Administration, Treatments, and Dosing

In some embodiments, the technology relates to methods of providing a dosage of a nucleotide analog, oligonucleotide comprising a nucleotide analog, or a conjugate thereof (e.g., comprising a targeting moiety, contrast agent, label, tag, etc.) to a subject. In some embodiments, a compound, a derivative thereof, or a pharmaceutically acceptable salt thereof, is administered in a pharmaceutically effective amount. In some embodiments, a compound, a derivative thereof, or a pharmaceutically acceptable salt thereof, is administered in a therapeutically effective dose.

The dosage amount and frequency are selected to create an effective level of the compound without substantially harmful effects. When administered orally or intravenously, the dosage of the compound or related compounds will generally range from 0.001 to 10,000 mg/kg/day or dose (e.g., 0.01 to 1000 mg/kg/day or dose; 0.1 to 100 mg/kg/day or dose).

Methods of administering a pharmaceutically effective amount include, without limitation, administration in parenteral, oral, intraperitoneal, intranasal, topical, sublingual, rectal, and vaginal forms. Parenteral routes of administration include, for example, subcutaneous, intravenous, intramuscular, intrasternal injection, and infusion routes. In some embodiments, the compound, a derivative thereof, or a pharmaceutically acceptable salt thereof, is administered orally.

In some embodiments, a single dose of a compound or a related compound is administered to a subject. In other embodiments, multiple doses are administered over two or more time points, separated by hours, days, weeks, etc. In some embodiments, compounds are administered over a long period of time (e.g., chronically), for example, for a period of months or years (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or more months or years).

In such embodiments, compounds may be taken on a regular scheduled basis (e.g., daily, weekly, etc.) for the duration of the extended period.

The technology also relates to methods of treating a subject with a drug appropriate for the subject's malady. According to another aspect of the technology, a method is provided for treating a subject in need of such treatment with an effective amount of a compound or a salt thereof. The method involves administering to the subject an effective amount of a compound or a salt thereof in any one of the pharmaceutical preparations described above, detailed herein, and/or set forth in the claims. The subject can be any subject in need of such treatment. In the foregoing description, the technology is described in connection with a compound or salts thereof. Such salts include, but are not limited to, bromide salts, chloride salts, iodide salts, carbonate salts, and sulfate salts. It should be understood, however, that the compound is a member of a class of compounds and the technology is intended to embrace pharmaceutical preparations, methods, and kits containing related derivatives within this class. Another aspect of the technology then embraces the foregoing summary but read in each aspect as if any such derivative is substituted wherever “compound” appears.

In some embodiments, a subject is tested to assess the presence, the absence, or the level of a malady and/or a condition. Such testing is performed, e.g., by assaying or measuring a biomarker, a metabolite, a physical symptom, an indication, etc., to determine the risk of or the presence of the malady or condition. In some embodiments, the subject is treated with a compound based on the outcome of the test. In some embodiments, a subject is treated, a sample is obtained and the level of detectable agent is measured, and then the subject is treated again based on the level of detectable agent that was measured. In some embodiments, a subject is treated, a sample is obtained and the level of detectable agent is measured, the subject is treated again based on the level of detectable agent that was measured, and then another sample is obtained and the level of detectable agent is measured. In some embodiments, other tests (e.g., not based on measuring the level of detectable agent) are also used at various stages, e.g., before the initial treatment as a guide for the initial dose. In some embodiments, a subsequent treatment is adjusted based on a test result, e.g., the dosage amount, dosage schedule, identity of the drug, etc. is changed. In some embodiments, a patient is tested, treated, and then tested again to monitor the response to therapy and/or change the therapy. In some embodiments, cycles of testing and treatment may occur without limitation to the pattern of testing and treating, the periodicity, or the duration of the interval between each testing and treatment phase. As such, the technology contemplates various combinations of testing and treating without limitation, e.g., test/treat, treat/test, test/treat/test, treat/test/treat, test/treat/test/treat, test/treat/test/treat/test, test/treat/test/test/treat/treat/treat/test, treat/treat/test/treat, test/treat/treat/test/treat/treat, etc.

Although the disclosure herein refers to certain illustrated embodiments, it is to be understood that these embodiments are presented by way of example and not by way of limitation.

EXAMPLES

Example 1—Characterization of Nucleotide Analogs

During the development of embodiments of the technology provided herein, nucleotide analogs were characterized by analytical chemical methods. In particular, 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, 3′-O-propargyl-dGTP, and 3′-O-propargyl-dTTP were synthesized according to the synthetic schemes described herein and characterized by ¹H NMR, ³¹P NMR, anion exchange HPLC, and high-resolution mass spectrometry. The analytical testing indicated that the synthesis and purification were successful (Figures X-Y).

Example 2—Assays for Identifying Compatible Polymerases

In some embodiments, the technology is related to the incorporation of nucleotide analogs into a nucleic acid. Accordingly, the technology provides assays for identifying polymerases that recognize nucleotide analogs (e.g., 3′-O-propargyl-dNTP) as substrates. For example, in some exemplary assays and/or embodiments, compatible polymerases are identified by a polymerase extension reaction (e.g., a single base extension reaction). See, e.g., Ausubel et al. (eds.), Current Protocols in Molecular Biology. New York: John Wiley & Sons, Inc; Sambrook et al. (1989). Molecular Cloning: A Laboratory Manual. (2nd ed.). Cold Spring Harbor: Cold Spring Harbor Laboratory Press.

For example, identifying compatible polymerases comprises providing a polymerase to test and a reaction buffer appropriate for the polymerase. For polymerases obtained from a commercial supplier (e.g., New England BioLabs, United States Biologicals, Promega, Invitrogen, Worthington, Sigma-Aldrich, Fluka, Finnzymes, Roche, 5 Prime, Qiagen, KAPA Biosystems, Thermo Scientific, Agilent, Life Technologies, etc.), the polymerase is often supplied with an appropriate reaction buffer. An exemplary reaction buffer comprises, e.g., a compatible buffer (e.g., 20 mM Tris-HCl), a salt (e.g., 10 mM KCl), a source of magnesium or manganese (e.g., 2 mM MgSO₄; 2 mM MnCl₂, etc.), a detergent (e.g., 0.1% TRITON X-100) and has a suitable pH (e.g., approximately pH 8.8 at approximately 25° C.). The activities of some polymerases are improved in the presence of other compounds, such as sulfate and other salts (e.g., 10 mM (NH₄)₂SO₄). Reaction mixtures for polymerase extension reactions are typically tested using Mg²⁺ or Mn²⁺ as the enzyme cofactor.

Polymerases are tested by providing in the reaction mixture a DNA template, a DNA primer that is complementary to the DNA template, and one or more nucleotides and/or nucleotide analogs. Typical concentrations of template and primer are approximately from 1 to 100 nM and typical concentrations of nucleotides and/or nucleotide analogs are approximately from 1 to 125 μM (e.g., 1 to 125 μM for each nucleotide and/or nucleotide analog and/or 1 to 500 μM total concentration of all nucleotides and/or nucleotide analogs). Templates and primers are synthesized by methods known in the art (e.g., using solid supports and phosphoramidite chemistry) and are available from several commercial suppliers (e.g., Integrated DNA Technologies, Coralville, Iowa).

A pre-annealed primer/template is typically produced for testing polymerases. For example, the primer is typically resuspended in a suitable buffer (e.g., Tris-EDTA, pH 8.0) at a suitable concentration, e.g., at 1 to 500 μM (e.g., at 100 μM) and the template is typically resuspended in a suitable buffer (e.g., Tris-EDTA, pH 8.0) at a suitable concentration, e.g., at 1 to 500 μM (e.g., at 100 μM). Then, a pre-annealed primer/template is produced by mixing approximately equal amounts of the primer and template in an annealing buffer. For example, a pre-annealed primer/template is produced by mixing approximately 100 μl of the approximately 1 to 500 μM (e.g., at 100 μM) primer solution with approximately 100 μl of the approximately 1 to 500 μM (e.g., at 100 μM) template solution in approximately 800 μl of an annealing buffer (e.g., 200 mM Tris, 100 mM potassium chloride, and 0.1 mM EDTA, pH 8.45) to provide a milliliter of primer/template solution. One of skill in the art can scale the volumes and concentrations as appropriate for the particular analysis. Then, an aliquot (e.g., 100 μl) of the primer/template solution is heated to denature intramolecular and/or intermolecular secondary structures (e.g., by heating at approximately 85° C. to 97° C. (e.g., at approximately 95° C.), e.g., for 1 to 5 minutes (e.g., 2 minutes)). Next, the aliquot is cooled to an annealing temperature (e.g., 20° C. to 60° C. (e.g., 25° C.)) and incubated for 1 to 10 minutes (e.g., for approximately 5 minutes) to allow the primer and template to anneal to form a primer/template. The primer/template can be diluted in an appropriate substrate dilution buffer (e.g., 20 mM Tris, 10 mM potassium chloride, and 0.01 mM EDTA, pH 8.45; e.g., a 1 to 10 dilution of the annealing buffer described above) for storage. For example, the primer/template can be diluted to a final concentration of 0.01 μM (e.g., to provide a 10x stock) in the substrate dilution buffer, aliquoted, and stored at −20° C.
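The arithmetic behind the exemplary annealing mix can be sketched as follows. This is only an illustration using the exemplary 100 μM stocks and volumes given above; the function names are hypothetical and the calculation assumes ideal mixing (C1·V1 = C2·V2).

```python
def mix_concentration(stock_um: float, stock_ul: float, total_ul: float) -> float:
    """Concentration (in µM) after stock_ul of a stock is brought to total_ul."""
    return stock_um * stock_ul / total_ul


def fold_dilution(start_um: float, target_um: float) -> float:
    """Fold dilution needed to go from start_um to target_um."""
    return start_um / target_um


# Exemplary values from the text: 100 µl primer + 100 µl template + 800 µl buffer.
total_ul = 100.0 + 100.0 + 800.0
primer_um = mix_concentration(100.0, 100.0, total_ul)
template_um = mix_concentration(100.0, 100.0, total_ul)

# Dilution required to reach the exemplary 0.01 µM storage stock.
fold = fold_dilution(primer_um, 0.01)

print(f"primer: {primer_um} µM, template: {template_um} µM")
print(f"dilute about {fold:.0f}-fold to reach 0.01 µM")
```

With these exemplary numbers, the annealed primer/template is at approximately 10 μM before dilution, so reaching 0.01 μM takes roughly a 1000-fold dilution in the substrate dilution buffer.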

Software packages are known in the art that provide assistance in designing templates and primers for these assays. In addition, several equations are available for calculating denaturation (e.g., melting (Tm) temperatures) and annealing temperatures. Standard references describe a simple estimate of the Tm value that may be calculated by the equation: Tm = 81.5 + 0.41 × (% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see, e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi and SantaLucia, Biochemistry 36: 10581-94 (1997)) include more sophisticated computations that account for structural, environmental, and sequence characteristics.
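As a minimal sketch, the simple estimate above can be computed directly from a sequence. The function names are hypothetical. The code implements only the equation as stated (1 M NaCl, no length or mismatch terms), so it should be read as a rough guide and not as a substitute for nearest-neighbor calculations.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def simple_tm(seq: str) -> float:
    """Tm estimate in °C from Tm = 81.5 + 0.41 * (% G+C), as given in the text."""
    return 81.5 + 0.41 * gc_percent(seq)


if __name__ == "__main__":
    # Hypothetical example sequence with 50% G+C content.
    print(round(simple_tm("ATGCATGC"), 1))
```

Because this form of the equation has no length term, it overestimates the Tm of short oligonucleotides such as primers.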

The primer and template or pre-annealed template is/are used to test the polymerase. For example, primer extension assays are conducted with 1 to 100 nM (e.g., 50 nM) of the primer/template, dNTPs (e.g., a mixture of 1 to 125 μM of each dATP, dCTP, dGTP, dTTP, modified dATP (e.g., 3′-O-propargyl-dATP), modified dCTP (e.g., 3′-O-propargyl-dCTP), modified dGTP (e.g., 3′-O-propargyl-dGTP), and/or modified dTTP (e.g., 3′-O-propargyl-dTTP)), and polymerase (e.g., 1 to 100 U of thermostable/thermophilic polymerase or mesophilic polymerase) in a final volume of approximately 1 to 100 μl (e.g., 10 to 20 μl) containing an appropriate buffer (e.g., as provided by the commercial supplier of the polymerase or the exemplary reaction buffer as described above). In some assays, the reaction mixture comprises 1 to 50 U (e.g., 1 U) of thermostable pyrophosphatase.

In some assays, a mixture of dNTPs and modified dNTPs is used. For example, some assays test the incorporation of a single base into a nucleic acid (e.g., in a single base extension assay). In such an assay, the primer hybridizes to a complementary region in the template to form a duplex such that the primer's terminal 3′ end is directly adjacent to the base pairing partner of the nucleotide analog to be tested. In a successful test, the candidate polymerase being tested incorporates a single nucleotide analog at the 3′ end of the primer. Many approaches are available for detecting the incorporation of the nucleotide analog, including fluorescence labeling, mass labeling for mass spectrometry, measuring enzyme activity using a protein moiety, and isotope labeling.

In particular, the assay tests the incorporation of a modified nucleotide (e.g., 3′-O-propargyl-dNTP) to the 3′ end of the primer as directed by the template. In such an assay, the reaction mixture can contain three dNTPs and the one particular modified dNTP that is added to the 3′ end of the primer as directed by the template. Some assays comprise the use of four individual reaction mixtures comprising each of the four primers annealed to a template (e.g., four primer/templates) designed such that each of the four modified nucleotides is to be added to the 3′ end of the primer as directed by the template. For example, in some embodiments a primer/template is provided to test incorporation of a modified dATP (e.g., 3′-O-propargyl-dATP):

NNNNNNNNNNNNNNN
|||||||||||||||
NNNNNNNNNNNNNNNTNNNNNNNNNNNNNNNNN

in which N is any nucleotide and “|” indicates complementary base pairing between the exemplary primer (top strand) and exemplary template (bottom strand). In some embodiments a primer/template is provided to test incorporation of a modified dCTP (e.g., 3′-O-propargyl-dCTP):

NNNNNNNNNNNNNNN
|||||||||||||||
NNNNNNNNNNNNNNNGNNNNNNNNNNNNNNNNN

in which N is any nucleotide and “|” indicates complementary base pairing between the exemplary primer (top strand) and exemplary template (bottom strand). In some embodiments a primer/template is provided to test incorporation of a modified dGTP (e.g., 3′-O-propargyl-dGTP):

NNNNNNNNNNNNNNN
|||||||||||||||
NNNNNNNNNNNNNNNCNNNNNNNNNNNNNNNNN

in which N is any nucleotide and “|” indicates complementary base pairing between the exemplary primer (top strand) and exemplary template (bottom strand). In some embodiments a primer/template is provided to test incorporation of a modified dTTP (e.g., 3′-O-propargyl-dTTP):

NNNNNNNNNNNNNNN
|||||||||||||||
NNNNNNNNNNNNNNNANNNNNNNNNNNNNNNNN

in which N is any nucleotide and “|” indicates complementary base pairing between the exemplary primer (top strand) and exemplary template (bottom strand). The primers and templates can be any appropriate length for the assay and the position of the single-base extension can be directed by any appropriate nucleotide of the template, usually within the central portion of the template.

The polymerase is tested in the reaction mixture at a temperature appropriate for the polymerase. For example, a mesophilic polymerase is tested at a temperature of from 20° C. to 60° C. and a thermophilic polymerase is tested at a temperature from 80° C. to 97° C. or more (e.g., 100° C. or more). Appropriate temperatures are indicated by the literature accompanying commercially supplied polymerases; appropriate temperatures for other (e.g., any) polymerase can be determined by one of skill in the art by testing polymerase activity with standard nucleotides over a range of temperatures.

In some assays, the temperature is cycled between a temperature to denature nucleic acids (e.g., a melting temperature) of approximately 85° C. to 97° C. (e.g., at approximately 95° C.) for 1 to 5 minutes (e.g., 2 minutes), an annealing temperature of approximately 40° C. to 70° C. (e.g., at 55° C.) for 5 to 60 seconds (e.g., 15 to 20 seconds), and an extension temperature of approximately 60 to 75° C. (e.g., 70 to 75° C.) for 15 to 60 seconds (e.g., 20 to 45 seconds), e.g., for 20 to 50 cycles.

Successful incorporation of a modified nucleotide (e.g., a 3′-O-propargyl dNTP) is determined by any number of methods. In some particular assays, the size of the reaction product is quantified to determine if the modified nucleotide (e.g., a 3′-O-propargyl dNTP) was added to the primer. In particular, the product of a successful incorporation is one base longer than the known length of the primer. The primer can be assayed as a negative control sample for comparison. Also, a synthetic positive control oligonucleotide having the length and structure of a reaction product expected from a successful incorporation can be assessed. Any method of discriminating between nucleic acids that differ by one base is appropriate for the assay, e.g., gel electrophoresis (e.g., Agilent Bioanalyzer), mass spectrometry, HPLC, etc.

Example 3—Polymerase Screening

During the development of embodiments of the technology provided herein, experiments were conducted to identify polymerases that can efficiently incorporate 3′-O-propargyl-dNTP as substrates. In particular, embodiments of the exemplary nucleotide extension assays described in Example 2 were used to test multiple polymerase enzymes including those sold under the trade names Ampli-Taq (Life Technologies), KAPA HiFi (KAPA Biosystems), KAPA 2G (KAPA Biosystems), Herculase II Fusion DNA polymerase (Agilent), PfuUltra II Fusion HS DNA polymerase (Agilent), Phire HS II DNA polymerase (Thermo Scientific), M-MuLV Reverse Transcriptase (NEB), rTth DNA polymerase, 9° N DNA Polymerase (NEB), THERMINATOR I DNA Polymerase (NEB), THERMINATOR II DNA polymerase (NEB), and 5 additional custom, non-catalog polymerases from NEB. Reaction conditions recommended by the commercial suppliers were followed for all polymerases tested. Tests of each polymerase were performed using both Mg²⁺ and Mn²⁺ as the co-factor in the reaction mixture.

The data collected indicated that the polymerases derived from Thermococcus sp. (e.g., Thermococcus sp. 9° N (e.g., THERMINATOR I and THERMINATOR II)) incorporated the 3′-O-propargyl dNTPs provided herein into a nucleic acid (Table 1).

TABLE 1
Summary of polymerase screening

co-factor   Amplitaq Gold   KAPA HiFi   KAPA 2G   Herculase II Fusion   PfuUltra II Fusion   Phire HS II   M-MuLV   rTth   9° N
Mg²⁺        −               −           −         −                     −                    −             −        −      −
Mn²⁺        −               −           −         −                     −                    −             −        −      +

co-factor   Therminator I   Therminator II   NEB 1   NEB 2   NEB 3   NEB 4   NEB 5
Mg²⁺        −               +                −       −       −       −       −
Mn²⁺        −               +++              −       −       −       −       −

In Table 1, a minus (“−”) indicates that the polymerase did not produce a detectable product incorporating the 3′-O-propargyl dNTP, a single plus (“+”) indicates that the polymerase produced a detectable product incorporating the 3′-O-propargyl dNTP, and three plusses (“+++”) indicate that the polymerase produced a substantial amount of 3′-O-propargyl dNTP incorporation product. NEB1, NEB2, NEB3, NEB4, and NEB5 indicate each of the five non-commercial New England BioLabs polymerases tested.
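For readers who want to work with the screening results programmatically, the following sketch encodes Table 1 as a dictionary and lists the polymerase/co-factor combinations that gave a detectable product. The data structure and names are illustrative assumptions; the scores are transcribed from Table 1, with an ASCII "-" standing in for the minus sign.

```python
# Table 1 transcribed as {polymerase: {co-factor: score}}.
SCREEN = {
    "Amplitaq Gold":       {"Mg": "-", "Mn": "-"},
    "KAPA HiFi":           {"Mg": "-", "Mn": "-"},
    "KAPA 2G":             {"Mg": "-", "Mn": "-"},
    "Herculase II Fusion": {"Mg": "-", "Mn": "-"},
    "PfuUltra II Fusion":  {"Mg": "-", "Mn": "-"},
    "Phire HS II":         {"Mg": "-", "Mn": "-"},
    "M-MuLV":              {"Mg": "-", "Mn": "-"},
    "rTth":                {"Mg": "-", "Mn": "-"},
    "9° N":                {"Mg": "-", "Mn": "+"},
    "Therminator I":       {"Mg": "-", "Mn": "-"},
    "Therminator II":      {"Mg": "+", "Mn": "+++"},
    "NEB 1":               {"Mg": "-", "Mn": "-"},
    "NEB 2":               {"Mg": "-", "Mn": "-"},
    "NEB 3":               {"Mg": "-", "Mn": "-"},
    "NEB 4":               {"Mg": "-", "Mn": "-"},
    "NEB 5":               {"Mg": "-", "Mn": "-"},
}


def detectable(screen: dict) -> list:
    """Return (polymerase, co-factor, score) for every non-negative result."""
    return [
        (name, cofactor, score)
        for name, results in screen.items()
        for cofactor, score in results.items()
        if score != "-"
    ]


hits = detectable(SCREEN)
for name, cofactor, score in hits:
    print(f"{name} with {cofactor}2+: {score}")
```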

It is to be understood that assays (e.g., as described herein, as described elsewhere, and as are known in the art) are available to identify any polymerases that incorporate modified nucleotides (e.g., the 3′-O-propargyl dNTPs provided herein) into a nucleic acid. Accordingly, the technology is not limited by the use of the Thermococcus sp. 9° N (THERMINATOR I and THERMINATOR II) polymerases and contemplates the use of any appropriate extant or yet to be discovered polymerase that incorporates the modified nucleotides (e.g., the 3′-O-propargyl dNTPs provided herein) into a nucleic acid. Experiments described herein using the Thermococcus sp. 9° N (THERMINATOR II) polymerases are exemplary and do not limit the technology to the use of any particular polymerase.

Example 4—Incorporation of 3′-O-propargyl-dNTP into a Nucleic Acid

During the development of embodiments of the technology provided herein, experiments were conducted to assess the incorporation of 3′-O-propargyl-dNTPs into a nucleic acid by a polymerase. In particular, experiments were conducted to evaluate the accurate incorporation of 3′-O-propargyl-dNTPs into a nucleic acid and to evaluate the terminating activity of the 3′-O-propargyl-dNTPs. To assess these characteristics of the nucleotide analogs provided herein, polymerase extension assays were conducted using a template nucleic acid having a sequence from human KRAS (e.g., KRAS exon 2 and flanking intron sequences) and a complementary primer (Table 2).

TABLE 2
Template and primer sequences used to test incorporation of 3′-O-propargyl-dNTP

Name               length (bases)   SEQ ID NO:   Sequence (5′ to 3′)
KRAS template      177              1            TTATTATAAGGCCTGCTGAAAATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGTGGC
                                                 GTAGGCAAGAGTGCCTTGACGATACAGCTAATTCAGAATCATTTTGTGGACGAATATGAT
                                                 CCAACAATAGAGGTAAATCTTGTTTTAATATGCATATTACTGGTGCAGGACCATTCT
R_ke2_trP1_T_bio   48               2            bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAA
R_ke2_trP1_A_bio   49               3            bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAAT
R_ke2_trP1_G_bio   51               4            bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAATAT
R_ke2_trP1_C_bio   52               5            bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAATATG

In Table 2, a “b” indicates a biotin modification and a “U” indicates a deoxyuridine modification. Incorporation of the primers into extension products produces extension products comprising a uracil. The uracil is useful, e.g., for cleavage of the product (e.g., using uracil cleavage reagents) in a number of molecular biological manipulations (e.g., cleaving the product from a solid support).
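The relationship between each primer in Table 2 and the nucleotide analog it is designed to test can be checked with a short sketch. The sequences are copied from Table 2; the helper names and the 15-base seed used to locate the primer's 3′ end on the template are illustrative assumptions, and the code models ideal Watson-Crick pairing only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

KRAS_TEMPLATE = (
    "TTATTATAAGGCCTGCTGAAAATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGTGGC"
    "GTAGGCAAGAGTGCCTTGACGATACAGCTAATTCAGAATCATTTTGTGGACGAATATGAT"
    "CCAACAATAGAGGTAAATCTTGTTTTAATATGCATATTACTGGTGCAGGACCATTCT"
)

PRIMERS = {
    "R_ke2_trP1_T_bio": "bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAA",
    "R_ke2_trP1_A_bio": "bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAAT",
    "R_ke2_trP1_G_bio": "bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAATAT",
    "R_ke2_trP1_C_bio": "bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAATATG",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unmodified DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def normalize_primer(primer: str) -> str:
    """Drop the leading 'b' (biotin) and treat 'U' (deoxyuridine) as T for pairing."""
    return primer.lstrip("b").replace("U", "T")


def next_base(template: str, primer: str, seed: int = 15) -> str:
    """First base a polymerase would add to the primer's 3' end on this template."""
    product = reverse_complement(template)  # strand copied from the template, 5'->3'
    start = product.find(normalize_primer(primer)[-seed:])
    if start < 0:
        raise ValueError("primer 3' end does not anneal to the template")
    end = start + seed
    if end >= len(product):
        raise ValueError("no template base left to copy")
    return product[end]


for name, primer in PRIMERS.items():
    base = next_base(KRAS_TEMPLATE, primer)
    print(f"{name}: tests 3'-O-propargyl-d{base}TP")
```

Under these assumptions, each primer name (T, A, G, or C) matches the base that the template directs onto its 3′ end, which is consistent with the primer/analog pairings used in this example.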

To test incorporation of a 3′-O-propargyl-dTTP into a nucleic acid, a polymerase extension reaction mix was assembled comprising 20 mM Tris-HCl, 10 mM (NH₄)₂SO₄, 10 mM KCl, 2 mM MnCl₂, 0.1% Triton X-100, 200 pmol 3′-O-propargyl-dTTP, 6.25 pmol of primer R_ke2_trP1_T_bio (SEQ ID NO: 2), and 2 units of Therminator II DNA polymerase (New England BioLabs) in a 25-μl final reaction volume. An amount of 0.5 pmol of the KRAS template (SEQ ID NO: 1) was used as template (Table 2). The polymerase extension reaction was performed using a temperature cycling profile comprising exposing the reaction to a temperature of 95° C. for 2 minutes followed by 35 cycles of 95° C. for 15 seconds, 55° C. for 25 seconds, and 65° C. for 35 seconds.
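As a convenience, the amounts stated above can be converted to final concentrations, and the hold times of the cycling profile can be summed. This is a sketch with hypothetical function names; it uses only the amounts, volume, and times given in the text and ignores instrument ramp times.

```python
def micromolar(pmol: float, volume_ul: float) -> float:
    """pmol per µl is numerically equal to µmol per litre (µM)."""
    return pmol / volume_ul


def cycling_seconds(initial_s: int, cycles: int, steps_s: tuple) -> int:
    """Total hold time: initial denaturation plus cycles of the listed steps."""
    return initial_s + cycles * sum(steps_s)


REACTION_UL = 25.0
amounts_pmol = {
    "3'-O-propargyl-dTTP": 200.0,
    "primer": 6.25,
    "KRAS template": 0.5,
}

conc_um = {name: micromolar(pmol, REACTION_UL) for name, pmol in amounts_pmol.items()}
total_s = cycling_seconds(120, 35, (15, 25, 35))

for name, value in conc_um.items():
    print(f"{name}: {value:g} µM")
print(f"cycling hold time: {total_s} s ({total_s / 60:.1f} min)")
```

With these inputs, the analog is at 8 μM, the primer at 0.25 μM, and the template at 0.02 μM (20 nM), so the primer is in 12.5-fold molar excess over the template.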

After the polymerase extension reaction, 1 μl of the reaction mix was used directly for nucleic acid size analysis by gel electrophoresis (e.g., using an Agilent 2100 Bioanalyzer and High Sensitivity DNA Assay Chip). Data collected from size analysis showed the presence of a population of nucleic acids having a length corresponding to the length of the primer used in the reaction (e.g., 48 bases) and a population of nucleic acids having a length corresponding to the length of the primer plus one base (e.g., 49 bases). Accordingly, the data collected indicated the successful incorporation of the 3′-O-propargyl-dTTP at the 3′ end of the primer. Further, the amounts of the two populations of nucleic acids were approximately equal, thus indicating the robust incorporation of the 3′-O-propargyl-dTTP at the 3′ end of the primer to form the extension product.

Additional polymerase extension experiments were performed using the reaction conditions described above and replacing the 3′-O-propargyl-dTTP and the primer R_ke2_trP1_T_bio with 3′-O-propargyl-dATP and the primer R_ke2_trP1_A_bio (SEQ ID NO: 3); 3′-O-propargyl-dCTP and the primer R_ke2_trP1_C_bio (SEQ ID NO: 5); and 3′-O-propargyl-dGTP and the primer R_ke2_trP1_G_bio (SEQ ID NO: 4). The data collected from these experiments similarly indicated the successful incorporation of 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, and 3′-O-propargyl-dGTP, respectively, at the 3′ end of the primers.

Example 5—Ladder Fragment Generation

During the development of embodiments of the technology provided herein, experiments were conducted to assess using the nucleotide analogs of the present technology to generate nucleic acid fragments that terminate at base-specific positions. In particular, reaction mixtures were produced and tested that included both natural dNTPs and each of the 3′-O-propargyl-dNTPs individually.

To test the fragment generation by 3′-O-propargyl-dTTP, a DNA fragment generation reaction mix was prepared comprising 20 mM Tris-HCl, 10 mM (NH₄)₂SO₄, 10 mM KCl, 2 mM MnCl₂, 0.1% Triton X-100, 1000 pmol dATP, 1000 pmol dCTP, 1000 pmol dGTP, 1000 pmol dTTP, 200 pmol 3′-O-propargyl-dTTP, 6.25 pmol of primer R_ke2_trP1_T_bio (SEQ ID NO: 2), and 2 units of THERMINATOR II DNA polymerase (New England BioLabs) in a 25-μl final reaction volume. An amount of 0.5 pmol of the KRAS template (SEQ ID NO: 1) was used as template. The polymerase extension reaction was performed using a temperature cycling profile comprising exposing the reaction to a temperature of 95° C. for 2 minutes followed by 50 cycles of 95° C. for 15 seconds, 55° C. for 25 seconds, and 65° C. for 35 seconds.

After the polymerase extension reaction, 1 μl of the reaction mix was used directly for nucleic acid size analysis by gel electrophoresis (e.g., using an Agilent 2100 Bioanalyzer and High Sensitivity DNA Assay Chip). Data collected from size analysis showed that the reaction generated a population of nucleic acid fragments having a range of sizes corresponding to the expected lengths of nucleic acids that are complementary to the template and terminated by 3′-O-propargyl-dT at each position where termination is expected.

Additional polymerase extension experiments were performed using the reaction conditions described above and replacing the 3′-O-propargyl-dTTP with 3′-O-propargyl-dATP, 3′-O-propargyl-dCTP, or 3′-O-propargyl-dGTP. The data collected from these experiments similarly indicated that the reactions generated populations of nucleic acid fragments having a range of sizes corresponding to the expected lengths of nucleic acids that are complementary to the template and terminated by 3′-O-propargyl-dA, 3′-O-propargyl-dC, or 3′-O-propargyl-dG at each position where termination is expected.
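The "expected lengths" referred to above can be enumerated from the template and primer sequences of Table 2. The sketch below is an idealized model that assumes termination can occur at every position where the terminating base is incorporated; it says nothing about the relative abundance of each fragment. The helper names and the 15-base seed are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

KRAS_TEMPLATE = (
    "TTATTATAAGGCCTGCTGAAAATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGTGGC"
    "GTAGGCAAGAGTGCCTTGACGATACAGCTAATTCAGAATCATTTTGTGGACGAATATGAT"
    "CCAACAATAGAGGTAAATCTTGTTTTAATATGCATATTACTGGTGCAGGACCATTCT"
)
PRIMER_T = "bTAAUCCTCTCTATGGGCAGTCGGTGATAGAATGGTCCTGCACCAGTAA"  # SEQ ID NO: 2


def termination_lengths(template: str, primer: str, terminator: str, seed: int = 15) -> list:
    """Lengths (in bases) of products ending in `terminator`, primer included."""
    product = template.translate(COMPLEMENT)[::-1]   # strand copied from the template
    bases = primer.lstrip("b").replace("U", "T")     # ignore biotin; pair U like T
    start = product.find(bases[-seed:])
    if start < 0:
        raise ValueError("primer 3' end does not anneal to the template")
    first_new = start + seed                         # index of the first added base
    return [
        len(bases) + (i - first_new) + 1
        for i in range(first_new, len(product))
        if product[i] == terminator
    ]


ladder_t = termination_lengths(KRAS_TEMPLATE, PRIMER_T, "T")
print(f"{len(ladder_t)} expected dT-terminated fragments; shortest is {ladder_t[0]} bases")
```

Passing "A", "C", or "G" as the terminator gives the corresponding ladders for the other three analogs with the same primer.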

Example 6—Synthesis of 5′-azido-methyl-modified Oligonucleotide

During the development of embodiments of the technology provided herein, an oligonucleotide comprising a 5′-azido-methyl modification was synthesized and characterized. Synthesis of the modified oligonucleotide was performed using phosphoramidite chemical synthesis. In the last synthetic step, phosphoramidite chemical synthesis was used to incorporate a 5′-iodo-dT phosphoramidite at the terminal 5′ position. The oligonucleotide attached to the solid support in the reaction column was then treated as follows.

First, sodium azide (30 mg) was resuspended in dry DMF (1 ml), heated for 3 hours at 55° C., and cooled to room temperature. The supernatant was taken up with a 1-ml syringe and passed back and forth through the reaction column comprising the 5′-iodo-modified oligonucleotide and incubated overnight at ambient (room) temperature. After incubation, the column was washed with dry DMF, washed with acetonitrile, and then dried via argon gas. The resulting 5′-azido-methyl-modified oligonucleotide was cleaved from the solid support and deprotected by heating in aqueous ammonia for 5 hours at 55° C. The final product was an oligonucleotide having the sequence shown below:

(SEQ ID NO: 6) Az-TCTGAGTCGGAGACACGCAGGGATGAGATGGT

The “Az” indicates the azido-methyl modification at the 5′ end (e.g., 5′-azido-methyl modification), e.g., to provide an oligonucleotide having a structure according to:

where B is the base of the nucleotide (e.g., adenine, guanine, thymine, cytosine, or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc.).

Example 7—Conjugation of 5′-azido-methyl-modified Oligonucleotide and 3′-O-propargyl-modified Nucleic Acid Fragments

During the development of embodiments of the technology provided herein, experiments were conducted to test the conjugation of a 5′-azido-methyl-modified oligonucleotide (e.g., see Example 6) to 3′-O-propargyl-modified nucleic acid fragments (e.g., see Example 5) by click chemistry. In particular, experiments were conducted in which a 5′-azido-methyl-modified oligonucleotide was chemically conjugated to 3′-O-propargyl-modified DNA fragments using copper (I) catalyzed 1,3-dipolar alkyne-azide cycloaddition chemistry (“click chemistry”).

Click chemistry was performed using commercially available reagents (baseclick GmbH, Oligo-Click-M Reload kit) according to the manufacturer's instructions. Briefly, approximately 0.1 pmol of 3′-O-propargyl-modified DNA fragments comprising a 5′-biotin modification were reacted with approximately 500 pmol of 5′-azido-methyl-modified oligonucleotide (see, e.g., Example 6, e.g., SEQ ID NO: 6) using the click chemistry reagent in a total volume of 10 μl. The reaction mixture was incubated at 45° C. for 30 minutes. Following the incubation, the supernatant was transferred to a new microcentrifuge tube and a 40-μl volume of the commercially supplied binding and wash buffer (e.g., 1 M NaCl, 10 mM Tris-HCl, 1 mM EDTA, pH 7.5) was added. The conjugated reaction product was isolated from the excess 5′-azido-methyl-modified oligonucleotide by incubating the click chemistry reaction mixture with streptavidin-coated magnetic beads (Dynabeads, MyOne Streptavidin C1, Life Technologies) at ambient (room) temperature for 15 minutes. The beads were separated from the supernatant using a magnet and the supernatant was removed. Subsequently, the beads were washed twice using the binding and wash buffer and then resuspended in 25 μl of TE buffer (10 mM Tris-HCl, 0.1 mM EDTA, pH approximately 8).

The product was cleaved from the solid support (bead) using uracil cleavage (Uracil Glycosylase and Endonuclease VIII, Enzymatics). In particular, uracil cleavage reagents were used to cleave the reaction products at the site of the deoxyuridine modification located near the 5′-terminal location of the conjugated product (see SEQ ID NOs: 2-5). Finally, the supernatant comprising the conjugated product was purified using Ampure XP (Beckman Coulter) following the manufacturer's protocol and eluted in 20 μl of TE buffer.

Example 8—Amplification of Conjugated Product

During the development of embodiments of the technology described herein, experiments were performed to characterize the chemical conjugation of the 5′-azido-methyl-modified oligonucleotide to the 3′-O-propargyl-modified nucleic acid fragments and to evaluate the triazole linkage as a mimic of a natural phosphodiester bond in a nucleic acid backbone. To test the ability of a polymerase to recognize the conjugated product as a template and traverse the triazole linkage during synthesis, PCR primers were designed to produce amplicons that span the triazole linkage of the conjugation products:

Primer 1 (SEQ ID NO: 7) CCTCTCTATGGGCAGTCGGTGAT
Primer 2 (SEQ ID NO: 8) CCATCTCATCCCTGCGTGTCTC

A commercially available PCR pre-mix (KAPA 2G HS, KAPA Biosystems) was used to provide a 25-μl reaction mixture comprising, in addition to components provided by the mix (e.g., buffer, polymerase, dNTPs), 0.25 μM Primer 1 (SEQ ID NO: 7), 0.25 μM of Primer 2 (SEQ ID NO: 8), and 2 μl of purified conjugated product (see Example 7) as template for amplification. The reaction mixture was thermally cycled by incubating the sample at 95° C. for 5 minutes, followed by 30 cycles of 98° C. for 20 seconds, 60° C. for 30 seconds, and 72° C. for 20 seconds. The amplification products were analyzed by gel electrophoresis (e.g., using an Agilent Bioanalyzer 2100 system and High-Sensitivity DNA Chip) to determine the size distributions of the reaction products.
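
As an illustrative check of the primer design (a minimal sketch, not part of the described method), the sequences of SEQ ID NO: 6 and SEQ ID NO: 8 given above can be compared to confirm that Primer 2 is the reverse complement of a segment of the 5′-azido-methyl-modified oligonucleotide, so that extension from Primer 2 proceeds toward the triazole linkage. The helper function and variable names are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as given in the text (the 5' "Az" modification is omitted).
ADAPTOR_SEQ_ID_6 = "TCTGAGTCGGAGACACGCAGGGATGAGATGGT"
PRIMER_2_SEQ_ID_8 = "CCATCTCATCCCTGCGTGTCTC"

# The segment of the adaptor strand that Primer 2 can hybridize to.
binding_site = reverse_complement(PRIMER_2_SEQ_ID_8)
offset = ADAPTOR_SEQ_ID_6.find(binding_site)  # -1 if no exact match

print("binding site:", binding_site)
print("0-based offset within SEQ ID NO: 6:", offset)
```

A non-negative offset indicates an exact match; Primer 1 could be checked against the primer-derived end of the fragments in the same way if those sequences are available.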

Analysis of the amplification products indicated that the amplification reaction successfully produced amplicons using the conjugated products of the click chemistry reaction (see Example 7) as templates for amplification. In particular, analysis of the amplification products indicated that the polymerase processed along the template and through the triazole linkage to produce amplicons from the template. Further, the amplification produced a heterogeneous population of amplicons having a range of sizes corresponding to the expected sizes produced by amplification of the base-specific terminated DNA fragments via incorporation of the 3′-O-propargyl-dNTP. The fragment analysis also showed the proper fragment size increase corresponding to thirty-one (31) additional bases from the conjugated 5′-azido-methyl-modified oligonucleotide.

Example 9—Ladder Generation using 3′-O-propargyl dNTP Termination

During the development of embodiments of the technology provided herein, experiments were conducted to assess the generation of terminated nucleic acid fragments in a reaction comprising a mixture of 3′-O-propargyl-dNTPs and natural (standard) dNTPs. In particular, experiments were conducted to assess the generation of fragments terminated at each position within the target region by incorporation of chain-terminating 3′-O-propargyl-dNTPs by DNA polymerase during synthesis.

Experiments were conducted using a mixture of natural dNTPs and all four of the 3′-O-propargyl-dNTPs in a single reaction. The DNA fragment generation reaction mix comprised 20 mM Tris-HCl, 10 mM (NH₄)₂SO₄, 10 mM KCl, 2 mM MnCl₂, 0.1% Triton X-100, 1000 pmol dATP, 1000 pmol dCTP, 1000 pmol dGTP, 1000 pmol dTTP, 100 pmol of 3′-O-propargyl-dATP, 100 pmol of 3′-O-propargyl-dCTP, 100 pmol of 3′-O-propargyl-dGTP, 100 pmol of 3′-O-propargyl-dTTP, 6.25 pmol of primer R_ke2_trP1_T_bio (SEQ ID NO: 2), and 2 units of Therminator II DNA polymerase (New England BioLabs) in a 25-μl reaction volume. An amount of 0.5 pmol of purified amplicon corresponding to a region in KRAS exon 2 (SEQ ID NO: 1) was used as template. The polymerase extension reaction was thermocycled by heating to 95° C. for 2 minutes, followed by 45 cycles at 95° C. for 15 seconds, 55° C. for 25 seconds, and 65° C. for 35 seconds.

After the polymerase extension reaction, 1 μl of the reaction mix was used directly for DNA fragment size analysis using gel electrophoresis (Agilent 2100 Bioanalyzer and High Sensitivity DNA Assay Chip). Fragment size analysis of the reaction products indicated that the fragment generation reaction successfully produced a ladder of nucleic acid fragments having the expected sizes.

Subsequently, a 5′-azido-methyl-modified oligonucleotide (see, e.g., Example 6, e.g., SEQ ID NO: 6) was chemically conjugated to the terminated DNA fragments comprising 3′-O-propargyl-dN using click chemistry as described in Example 6 and Example 7 above. After the conjugation, an amplification reaction was performed to amplify the conjugated products as described in Example 8. DNA fragment size analysis of the amplicons produced from the conjugated products showed the expected shift in size resulting from conjugation of the 5′-azido-modified oligonucleotide to the amplicons produced from the fragment ladder.

Example 10—Control of Fragment Size

During the development of embodiments of the technology provided herein, experiments were conducted to control the size distribution of terminated nucleic acid fragments produced in a reaction comprising a mixture of 3′-O-propargyl-dNTPs and natural (standard) dNTPs by adjusting the ratio of 3′-O-propargyl-dNTPs to natural (standard) dNTPs. It was contemplated that the molar ratio of 3′-O-propargyl-dNTPs and natural dNTPs affects the fragment size distribution due to competition between the 3′-O-propargyl-dNTPs (that terminate extension) and natural dNTPs (that elongate the polymerase product) for incorporation into the synthesized nucleic acid by the polymerase.

Accordingly, experiments were performed in which the products of fragment ladder generation reactions were assessed at various molar ratios of 3′-O-propargyl-dNTPs to natural dNTPs. Fragment ladder generation reactions were performed using 2:1, 10:1, and 100:1 molar ratios of natural dNTPs to 3′-O-propargyl-dNTPs. The fragment generation reaction mixtures used in these experiments comprised 20 mM Tris-HCl, 10 mM (NH₄)₂SO₄, 10 mM KCl, 2 mM MnCl₂, 0.1% Triton X-100, 1000 pmol dATP, 1000 pmol dCTP, 1000 pmol dGTP, 1000 pmol dTTP, 6.25 pmol of primer R_ke2_trP1_T_bio (SEQ ID NO: 2), 2 units of Therminator II DNA polymerase (New England BioLabs), and 0.5 pmol of purified amplicon corresponding to a region in KRAS exon 2 (SEQ ID NO: 1) as template in a 25-μl final reaction volume.

In addition, reactions testing a 2:1 ratio of natural dNTPs to 3′-O-propargyl-dNTPs comprised 500 pmol of 3′-O-propargyl-dATP, 500 pmol of 3′-O-propargyl-dCTP, 500 pmol of 3′-O-propargyl-dGTP, and 500 pmol of 3′-O-propargyl-dTTP. Reactions testing a 10:1 ratio of natural dNTPs to 3′-O-propargyl-dNTPs comprised 100 pmol of 3′-O-propargyl-dATP, 100 pmol of 3′-O-propargyl-dCTP, 100 pmol of 3′-O-propargyl-dGTP, and 100 pmol of 3′-O-propargyl-dTTP. Reactions testing a 100:1 ratio of natural dNTPs to 3′-O-propargyl-dNTPs comprised 10 pmol of 3′-O-propargyl-dATP, 10 pmol of 3′-O-propargyl-dCTP, 10 pmol of 3′-O-propargyl-dGTP, and 10 pmol of 3′-O-propargyl-dTTP.

The polymerase extension reactions were temperature cycled by incubating at 95° C. for 2 minutes, followed by 45 cycles at 95° C. for 15 seconds, 55° C. for 25 seconds, and 65° C. for 35 seconds. After the polymerase extension reaction, 5′-azido-methyl-modified oligonucleotides (see, e.g., Example 6, e.g., SEQ ID NO: 6) were chemically conjugated to the nucleic acid fragments terminated with 3′-O-propargyl-dN using click chemistry as described in Example 6 and Example 7. After the conjugation, the conjugation products were used as templates for amplification to produce amplicons corresponding to the conjugated products as described in Example 8. Fragment size analysis was performed on the conjugated products.

Fragment size analysis of the amplified conjugation products produced from the products of the three different molar ratio conditions indicated that the fragment size depended on the ratio of 3′-O-propargyl-dNTPs to natural dNTPs. Analysis of the fragment sizes showed a shift in the fragment size distribution as a function of the molar ratio of dNTP to 3′-O-propargyl-dNTP. At the 2:1 molar ratio, larger populations of shorter fragments were detected compared to the other two molar ratio conditions. At the 10:1 molar ratio, a larger fraction of longer fragments was present relative to the 2:1 molar ratio. At the 100:1 molar ratio, the major population of fragments comprised longer DNA fragments relative to the other two molar ratios.
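
The direction of this trend is consistent with a simple competition model, sketched below for illustration only. The model is an assumption and is not taken from the experiments: it supposes that the polymerase incorporates a 3′-O-propargyl-dNTP and the corresponding natural dNTP with equal efficiency, that each position is independent, and that the template is unbounded, so that the number of bases added before termination is geometrically distributed. Real polymerases typically discriminate between analogs and natural substrates, so the numbers are indicative of the trend, not of measured fragment sizes. All function names are hypothetical.

```python
def termination_probability(natural_to_terminator_ratio):
    """Per-position probability of incorporating a terminator.

    Assumes equal incorporation efficiency for the natural dNTP and its
    3'-O-propargyl analog (an idealization, not a measured value).
    """
    return 1.0 / (1.0 + natural_to_terminator_ratio)


def mean_extension_length(natural_to_terminator_ratio):
    """Mean bases added up to and including the terminator (geometric mean 1/p)."""
    return 1.0 / termination_probability(natural_to_terminator_ratio)


def fraction_terminated_within(natural_to_terminator_ratio, n_bases):
    """Fraction of extension products terminated within the first n_bases."""
    p = termination_probability(natural_to_terminator_ratio)
    return 1.0 - (1.0 - p) ** n_bases


# Natural and terminator amounts (pmol of each base) from the three conditions.
NATURAL_PMOL = 1000.0
terminator_pmol = {"2:1": 500.0, "10:1": 100.0, "100:1": 10.0}

for label, amount in terminator_pmol.items():
    ratio = NATURAL_PMOL / amount
    print(
        f"{label}: mean extension ~{mean_extension_length(ratio):.0f} bases, "
        f"{fraction_terminated_within(ratio, 20):.0%} terminated within 20 bases"
    )
```

Under these assumptions, lowering the proportion of 3′-O-propargyl-dNTPs lengthens the expected extension before termination, matching the qualitative shift toward longer fragments from the 2:1 to the 100:1 condition.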

All publications and patents mentioned in the above specification are herein incorporated by reference in their entirety for all purposes. Various modifications and variations of the described compositions, methods, and uses of the technology will be apparent to those skilled in the art without departing from the scope and spirit of the technology as described. Although the technology has been described in connection with specific exemplary embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the art are intended to be within the scope of the following claims.

We claim:
 1. A composition comprising a nucleotide analog having a structure according to:

wherein B is a base and P comprises a phosphate moiety.
 2. The composition of claim 1 wherein P comprises a tetraphosphate, a triphosphate, a diphosphate, a monophosphate, a modified tetraphosphate, a modified triphosphate, a modified diphosphate, or a modified monophosphate.
 3. The composition of claim 1 wherein B is selected from the group consisting of cytosine, guanine, adenine, thymine, and uracil.
 4. The composition of claim 1 wherein B comprises a purine, a pyrimidine, a modified purine, or a modified pyrimidine.
 5. The composition of claim 1 wherein the nucleotide analog comprises a thio-alkynyl, thio-propargyl, 3′-S-propargyl, thiofuranose, thioribose, thiodeoxyribose, arabinose, or a modified sugar.
 6. The composition of claim 1 wherein P comprises a 5′ hydroxyl, an alpha thiophosphate, a beta thiophosphate, a gamma thiophosphate, an alpha methylphosphonate, a beta methylphosphonate, or a gamma methylphosphonate.
 7. The composition of claim 1 further comprising a polymerase, a nucleic acid, or a nucleotide.
 8. The composition of claim 1 wherein the nucleotide analog is modified with a sulfur.
 9. The composition of claim 1 further comprising a nucleotide comprising the base B, wherein the number ratio of the nucleotide analog to the nucleotide is 1:1, 1:2, 1:3, 1:4, 1:5, 1:10, 1:15, 1:20, 1:25, 1:30, 1:50, 1:75, 1:100, 1:200, 1:300, 1:400, 1:500, 1:600, 1:700, 1:800, 1:900, 1:1000, 1:5000 or 1:10000.
 10. A composition comprising a nucleic acid comprising the nucleotide analog of claim 1.
 11. The composition of claim 10 wherein the nucleotide analog is at the 3′ end of the nucleic acid.
 12. The composition of claim 10 wherein the nucleic acid is produced by a polymerase.
 13. The composition of claim 10 further comprising an azide.
 14. The composition of claim 10 further comprising a second nucleic acid.
 15. The composition of claim 10 further comprising a second nucleic acid comprising an azide moiety.
 16. The composition of claim 10 further comprising a second nucleic acid comprising an azide moiety at the 5′ end of the second nucleic acid.
 17. The composition of claim 10 further comprising a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, a biotin comprising an azide, or a protein comprising an azide.
 18. The composition of claim 10 further comprising a copper-based catalyst reagent.
 19. The composition of claim 10 further comprising a nucleic acid comprising a triazole.
 20. The composition of claim 10 further comprising a nucleic acid comprising a 1′, 4′ substituted triazole.
 21. The composition of claim 10 further comprising an adaptor oligonucleotide, an adaptor oligonucleotide comprising a barcode, or a barcode oligonucleotide.
 22. The composition of claim 10 further comprising a nucleic acid comprising a structure according to:


 23. A composition comprising a 3′-O-propargyl nucleotide and a 5′-oligonucleotide